Moving objects detection from UAV captured videos using trajectories of matched regional adjacency graphs


Bibliographic Details
Main Author: Harandi, Bahareh Kalantar Ghorashi
Format: Thesis
Language: English
Published: 2017
Online Access:http://psasir.upm.edu.my/id/eprint/68539/1/FK%202018%2024%20-%20IR.pdf
http://psasir.upm.edu.my/id/eprint/68539/
Description
Summary: Videos captured using cameras from unmanned aerial vehicles (UAV) normally produce dynamic footage that commonly contains unstable camera motion with multiple moving objects. These objects are sometimes occluded by vegetation or even other objects, which presents a challenging environment for higher-level video processing and analysis. This thesis deals with the topic of moving object detection (MOD), which aims to identify and detect single or multiple moving objects in video. In the past, MOD was mainly tackled using image registration, which discovers correspondences between consecutive frames through pair-wise grayscale spatial appearance matching under rigid and affine transformations. However, traditional image registration is unsuitable for UAV-captured videos, since distance-based grayscale similarity cannot cater for the dynamic spatio-temporal differences of moving objects. Registration is also ineffective when dealing with object occlusion. This thesis therefore proposes a framework to address these issues through a two-step approach involving region matching and region labeling. Specifically, the objectives of this thesis are (i) to develop an image registration technique based on multigraph matching, (ii) to detect occluded objects through exploration of candidate object correspondences in longer frame sequences, and (iii) to develop a robust graph coloring algorithm for multiple moving object detection under different transformations. In general, each frame of the footage is first segmented into superpixel regions, for which appearance and geometrical features are calculated. Trajectory information is also considered across multiple frames, taking into account many types of transformations. Specifically, each frame is represented as a regional adjacency graph (RAG).
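To illustrate the first step, the sketch below builds a regional adjacency graph from a superpixel label map: each region becomes a node, and two regions are connected if their pixels touch. This is a minimal, hypothetical reconstruction for illustration only; the thesis attaches appearance, geometrical, and trajectory features to each node, which are omitted here.

```python
# Minimal sketch of RAG construction from a superpixel label map.
# Illustrative only: names and the 4-connectivity choice are assumptions,
# not the thesis's actual implementation.

def build_rag(labels):
    """labels: 2D list of superpixel ids. Returns {region: set of neighbors}."""
    rows, cols = len(labels), len(labels[0])
    adj = {}
    for r in range(rows):
        for c in range(cols):
            a = labels[r][c]
            adj.setdefault(a, set())
            # Check right and down neighbors (4-connectivity, forward only)
            for dr, dc in ((1, 0), (0, 1)):
                nr, nc = r + dr, c + dc
                if nr < rows and nc < cols:
                    b = labels[nr][nc]
                    if b != a:
                        adj.setdefault(b, set())
                        adj[a].add(b)
                        adj[b].add(a)
    return adj

# Toy 3x3 label map with three superpixel regions
label_map = [
    [0, 0, 1],
    [0, 2, 1],
    [2, 2, 1],
]
rag = build_rag(label_map)  # all three regions touch one another here
```

In the full framework, one such graph per frame feeds the multigraph matching step, which aligns nodes across several frames at once.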
Then, instead of pair-wise spatial matching as in image registration, correspondences between video frames are discovered through multigraph matching of robust spatio-temporal features of each region. Since more than two frames are considered at a time, this step is able to discover better region correspondences and to handle object occlusion. The second step, region labeling, relies on the assumption that background and foreground moving objects exhibit different motion properties over a sequence; their spatial difference is therefore expected to diverge drastically over time. Building on this, region labeling assigns each region a background or foreground label based on a proposed graph coloring algorithm that considers trajectory-based features. Overall, the framework consisting of these two steps is termed Motion Differences of Matched Region-based Features (MDMRBF). MDMRBF has been evaluated on two datasets, namely (i) the Defense Advanced Research Projects Agency (DARPA) Video Verification of Identity (VIVID) dataset and (ii) two self-captured videos using a camera mounted on a UAV. Precision and recall are used as the criteria to quantitatively evaluate and validate overall MOD performance. Both are computed against ground-truth data manually annotated for the video sequences. The proposed framework has also been compared with existing state-of-the-art detection algorithms. Experimental results show that MDMRBF outperforms these algorithms, with precision and recall of 94% and 89%, respectively. These results can be attributed to the integration of appearance and geometrical constraints for region matching using the multigraph structure. Moreover, the consideration of longer trajectories over multiple frames, while accounting for all transformation types, also helped resolve occlusion.
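The precision and recall figures quoted above follow the standard definitions. As a quick illustration (not the thesis's evaluation code, which scores pixel- or region-level annotations), detections and ground truth can be compared as sets:

```python
# Illustrative precision/recall computation over detected vs ground-truth
# region ids. The thesis computes these against manually annotated
# video sequences; the set-based form here is a simplification.

def precision_recall(detected, ground_truth):
    tp = len(detected & ground_truth)                 # true positives
    precision = tp / len(detected) if detected else 0.0
    recall = tp / len(ground_truth) if ground_truth else 0.0
    return precision, recall

# Toy example: 4 detections, 4 ground-truth objects, 3 in common
p, r = precision_recall({1, 2, 3, 5}, {1, 2, 3, 4})
```

High precision indicates few false detections; high recall indicates few missed objects. MDMRBF's reported 94%/89% reflects strong performance on both counts.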
With regard to time, the proposed approach can detect moving objects in a 30-second sequence within one minute, which makes it efficient in practice. In conclusion, the multiple moving object detection technique proposed in this study is robust to unknown transformations, with significant improvements in overall precision and recall compared to existing methods. The proposed algorithm is designed to tackle several limitations of existing algorithms: it handles inevitable occlusions, models the distinct motions of multiple moving objects, and considers spatial information.