Video surveillance using deep learning with few data samples

Bibliographic Details
Main Author: Lian, Yee Fu
Format: Final Year Project / Dissertation / Thesis
Published: 2020
Subjects:
Online Access: http://eprints.utar.edu.my/3891/1/16ACB05136_FYP.pdf
http://eprints.utar.edu.my/3891/
Description
Summary: Data quantity is an essential factor in the performance of an object detector and, in turn, of a video surveillance system. However, annotated image datasets for certain target domains are often limited. In a video surveillance system, the detector often has to detect new objects from limited annotated data because detection requirements change rapidly. A reliable and flexible video surveillance system should therefore perform object detection with relatively good performance even when only a few annotated images are provided. To this end, previous work on an object detection framework using deep learning with few data samples is studied, implemented and improved, so that the existing framework achieves higher object detection performance with few data samples. In this project, a feature extractor (YOLOv2) is implemented together with a re-weighting module, which re-weights the features extracted by the feature extractor in order to detect objects of N classes (including new classes). With this architecture, the model can reuse the feature extractor's prior knowledge of general object features (edges and corners) and combine it with the class-specific features from the re-weighting module. Since the amount of data is a key determinant of object detector performance, the first improvement increases the number of annotated images of novel classes by applying different data augmentations before the model is fine-tuned to detect new objects. In addition, the re-weighting module is modified so that it captures more information for re-weighting the features from the feature extractor when detecting new-class objects. The experimental results show that data augmentation and the modified re-weighting module achieve higher mean average precision (mAP), because the amount of fine-tuning data is increased and more re-weighting information can be captured.
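The re-weighting step described in the summary can be illustrated with a small sketch. It assumes that "re-weighting" means scaling each channel of the shared feature map by a class-specific coefficient; the summary does not spell out the exact operation, so this is an assumption and not the thesis's implementation. The function name `reweight_features`, the toy feature values, and the class names are all hypothetical.

```python
from typing import Dict, List

# A feature map stored as channels x height x width.
FeatureMap = List[List[List[float]]]


def reweight_features(
    features: FeatureMap,
    class_weights: Dict[str, List[float]],
) -> Dict[str, FeatureMap]:
    """Scale each channel of a shared feature map by a per-class coefficient.

    Assumption: re-weighting is channel-wise multiplication. Returns one
    re-weighted copy of the feature map for every class.
    """
    reweighted: Dict[str, FeatureMap] = {}
    for class_name, weights in class_weights.items():
        if len(weights) != len(features):
            raise ValueError("need exactly one weight per feature channel")
        reweighted[class_name] = [
            [[w * value for value in row] for row in channel]
            for channel, w in zip(features, weights)
        ]
    return reweighted


# Toy stand-in for the feature extractor output: 2 channels of size 2x2.
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.5, 0.5], [1.0, 1.0]],
]

# Hypothetical re-weighting vectors, one per class. In a real system these
# would be produced by the re-weighting module from a few annotated samples.
class_weights = {"person": [2.0, 0.0], "bicycle": [0.5, 4.0]}

per_class = reweight_features(features, class_weights)
```

Under this assumption, the generic features are computed once and shared across classes. Only the small per-class weight vectors differ, which is one way a model could pick up a new class from few samples.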