Real-Time Human Detection for Aerial Captured Video Sequences via Deep Models

Human detection in videos plays an important role in various real-life applications. Most traditional approaches depend on handcrafted features, which are problem-dependent and optimal only for specific tasks. Moreover, they are highly susceptible to dynamic events such as illumination chan...

Bibliographic Details
Main Authors: AlDahoul, Nouar; Sabri, Aznul Qalid Md; Mansoor, Ali Mohammed
Format: Article
Published: Hindawi 2018
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.um.edu.my/22582/
https://doi.org/10.1155/2018/1639561
id my.um.eprints.22582
record_format eprints
spelling my.um.eprints.22582 2019-09-26T06:31:16Z http://eprints.um.edu.my/22582/
Hindawi 2018 Article PeerReviewed AlDahoul, Nouar and Sabri, Aznul Qalid Md and Mansoor, Ali Mohammed (2018) Real-Time Human Detection for Aerial Captured Video Sequences via Deep Models. Computational Intelligence and Neuroscience, 2018. pp. 1-14. ISSN 1687-5265 https://doi.org/10.1155/2018/1639561 doi:10.1155/2018/1639561
institution Universiti Malaya
building UM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaya
content_source UM Research Repository
url_provider http://eprints.um.edu.my/
topic QA75 Electronic computers. Computer science
description Human detection in videos plays an important role in various real-life applications. Most traditional approaches depend on handcrafted features, which are problem-dependent and optimal only for specific tasks. Moreover, they are highly susceptible to dynamic events such as illumination changes, camera jitter, and variations in object sizes. On the other hand, the proposed feature learning approaches are cheaper and easier because highly abstract and discriminative features can be produced automatically without the need for expert knowledge. In this paper, we utilize automatic feature learning methods that combine optical flow with three different deep models (i.e., a supervised convolutional neural network (S-CNN), a pretrained CNN feature extractor, and a hierarchical extreme learning machine (H-ELM)) for human detection in videos captured using a nonstatic camera on an aerial platform with varying altitudes. The models are trained and tested on the publicly available and highly challenging UCF-ARG aerial dataset. The models are compared in terms of training accuracy, testing accuracy, and learning speed. The performance evaluation considers five human actions (digging, waving, throwing, walking, and running). Experimental results demonstrate that the proposed methods are successful for the human detection task. The pretrained CNN produces an average accuracy of 98.09%. S-CNN produces an average accuracy of 95.6% with softmax and 91.7% with Support Vector Machines (SVM). H-ELM has an average accuracy of 95.9%. On a normal Central Processing Unit (CPU), H-ELM's training takes 445 seconds. Learning in S-CNN takes 770 seconds with a high-performance Graphics Processing Unit (GPU).
format Article
author AlDahoul, Nouar
Sabri, Aznul Qalid Md
Mansoor, Ali Mohammed
title Real-Time Human Detection for Aerial Captured Video Sequences via Deep Models
publisher Hindawi
publishDate 2018
url http://eprints.um.edu.my/22582/
https://doi.org/10.1155/2018/1639561