Lightweight one-stage object detection models for machine vision based recognition system / Mohamad Haniff Junos

Object detection is an important research area of computer vision and has been applied in numerous real-world applications. Object detection is a complicated task that requires locating and identifying targets with significant variations in shape, size, and viewpoint, high obscurity, and complex backgrounds...


Bibliographic Details
Main Author: Mohamad Haniff, Junos
Format: Thesis
Published: 2022
Subjects:
Online Access:http://studentsrepo.um.edu.my/14908/2/Mohamad_Haniff.pdf
http://studentsrepo.um.edu.my/14908/1/Mohamad_Haniff.pdf
http://studentsrepo.um.edu.my/14908/
id my.um.stud.14908
record_format eprints
spelling my.um.stud.149082024-05-20T23:46:02Z Lightweight one-stage object detection models for machine vision based recognition system / Mohamad Haniff Junos Mohamad Haniff, Junos TK Electrical engineering. Electronics Nuclear engineering Object detection is an important research area of computer vision and has been applied in numerous real-world applications. Object detection is a complicated task that requires locating and identifying targets with significant variations in shape, size, and viewpoint, high obscurity, and complex backgrounds. Many approaches have been developed to address these challenges, aided by a remarkable increase in data availability and computational resources. While current state-of-the-art one-stage object detection methods achieve high detection accuracy, they are unsuitable for embedded devices due to their complex structure and extensive network parameters. Therefore, this study proposes two lightweight object detection models based on the You Only Look Once (YOLO) tiny models to improve detection accuracy, reduce model size, and achieve real-time performance. For proposed model I, the YOLO-A model, a hybrid backbone structure based on the densely connected neural network (DenseNet) and the mobile inverted bottleneck module (MBConv) was adopted into the YOLOv3 tiny model for better feature reuse and to reduce the network’s parameters. The swish activation function then replaced the Leaky ReLU function to enhance the expression of the input data and the weights to be learned. In addition, four detection layers were used to improve the detection of objects of various sizes. A spatial pyramid pooling (SPP) structure was adopted to increase the receptive field of the network, and a complete intersection over union (CIoU) function was used for bounding box regression. Proposed model II, the YOLO-S model, adopted several improvements, including the integration of MBConv into the YOLOv4 tiny model, SPP in the neck section, and three detection layers. 
The developed models were evaluated on the Palm and VisDrone datasets, and the results were compared with several state-of-the-art models in terms of detection performance. The experimental results indicate that the developed models achieved significant improvements over the original YOLO tiny models. On the Palm dataset, the proposed YOLO-A model outperformed the state-of-the-art models with an mAP of 97.29%, while the YOLO-S model achieved better accuracy than the other lightweight models with an mAP of 96.41%. On the VisDrone dataset, by contrast, the proposed YOLO-A model obtained a slightly lower mAP (36.15%) than the YOLOv4 model but a higher mAP than the rest; this is because the VisDrone dataset consists of many challenging images. Meanwhile, the YOLO-S model achieved a better mAP than the other lightweight models, at 24.73%. The detection speed and computational performance were evaluated on the VisDrone dataset. The results show that both models significantly improved computational performance: the YOLO-A model produced a model size of 65.6 MB with 28.416 BFLOPs, while the YOLO-S model produced the smallest size, 13.3 MB, with 6.280 BFLOPs. Furthermore, the models achieved better detection speeds of 3.4 FPS and 17.3 FPS, respectively, when tested on the Jetson Nano. The proposed models offer the best trade-offs among detection accuracy, model size, and detection time. The comprehensive results exhibit the superiority of the proposed models over several existing state-of-the-art detection models and their potential to be effectively deployed on embedded devices with limited capacity. 2022-12 Thesis NonPeerReviewed application/pdf http://studentsrepo.um.edu.my/14908/2/Mohamad_Haniff.pdf application/pdf http://studentsrepo.um.edu.my/14908/1/Mohamad_Haniff.pdf Mohamad Haniff, Junos (2022) Lightweight one-stage object detection models for machine vision based recognition system / Mohamad Haniff Junos. PhD thesis, Universiti Malaya. 
http://studentsrepo.um.edu.my/14908/
institution Universiti Malaya
building UM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaya
content_source UM Student Repository
url_provider http://studentsrepo.um.edu.my/
topic TK Electrical engineering. Electronics Nuclear engineering
spellingShingle TK Electrical engineering. Electronics Nuclear engineering
Mohamad Haniff, Junos
Lightweight one-stage object detection models for machine vision based recognition system / Mohamad Haniff Junos
description Object detection is an important research area of computer vision and has been applied in numerous real-world applications. Object detection is a complicated task that requires locating and identifying targets with significant variations in shape, size, and viewpoint, high obscurity, and complex backgrounds. Many approaches have been developed to address these challenges, aided by a remarkable increase in data availability and computational resources. While current state-of-the-art one-stage object detection methods achieve high detection accuracy, they are unsuitable for embedded devices due to their complex structure and extensive network parameters. Therefore, this study proposes two lightweight object detection models based on the You Only Look Once (YOLO) tiny models to improve detection accuracy, reduce model size, and achieve real-time performance. For proposed model I, the YOLO-A model, a hybrid backbone structure based on the densely connected neural network (DenseNet) and the mobile inverted bottleneck module (MBConv) was adopted into the YOLOv3 tiny model for better feature reuse and to reduce the network’s parameters. The swish activation function then replaced the Leaky ReLU function to enhance the expression of the input data and the weights to be learned. In addition, four detection layers were used to improve the detection of objects of various sizes. A spatial pyramid pooling (SPP) structure was adopted to increase the receptive field of the network, and a complete intersection over union (CIoU) function was used for bounding box regression. Proposed model II, the YOLO-S model, adopted several improvements, including the integration of MBConv into the YOLOv4 tiny model, SPP in the neck section, and three detection layers. The developed models were evaluated on the Palm and VisDrone datasets, and the results were compared with several state-of-the-art models in terms of detection performance. 
The experimental results indicate that the developed models achieved significant improvements over the original YOLO tiny models. On the Palm dataset, the proposed YOLO-A model outperformed the state-of-the-art models with an mAP of 97.29%, while the YOLO-S model achieved better accuracy than the other lightweight models with an mAP of 96.41%. On the VisDrone dataset, by contrast, the proposed YOLO-A model obtained a slightly lower mAP (36.15%) than the YOLOv4 model but a higher mAP than the rest; this is because the VisDrone dataset consists of many challenging images. Meanwhile, the YOLO-S model achieved a better mAP than the other lightweight models, at 24.73%. The detection speed and computational performance were evaluated on the VisDrone dataset. The results show that both models significantly improved computational performance: the YOLO-A model produced a model size of 65.6 MB with 28.416 BFLOPs, while the YOLO-S model produced the smallest size, 13.3 MB, with 6.280 BFLOPs. Furthermore, the models achieved better detection speeds of 3.4 FPS and 17.3 FPS, respectively, when tested on the Jetson Nano. The proposed models offer the best trade-offs among detection accuracy, model size, and detection time. The comprehensive results exhibit the superiority of the proposed models over several existing state-of-the-art detection models and their potential to be effectively deployed on embedded devices with limited capacity.
format Thesis
author Mohamad Haniff, Junos
author_facet Mohamad Haniff, Junos
author_sort Mohamad Haniff, Junos
title Lightweight one-stage object detection models for machine vision based recognition system / Mohamad Haniff Junos
title_short Lightweight one-stage object detection models for machine vision based recognition system / Mohamad Haniff Junos
title_full Lightweight one-stage object detection models for machine vision based recognition system / Mohamad Haniff Junos
title_fullStr Lightweight one-stage object detection models for machine vision based recognition system / Mohamad Haniff Junos
title_full_unstemmed Lightweight one-stage object detection models for machine vision based recognition system / Mohamad Haniff Junos
title_sort lightweight one-stage object detection models for machine vision based recognition system / mohamad haniff junos
publishDate 2022
url http://studentsrepo.um.edu.my/14908/2/Mohamad_Haniff.pdf
http://studentsrepo.um.edu.my/14908/1/Mohamad_Haniff.pdf
http://studentsrepo.um.edu.my/14908/
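The abstract notes that the swish activation function replaced Leaky ReLU in the proposed YOLO-A model. As a minimal illustrative sketch of that substitution (standard definitions of both functions, not code from the thesis), the two activations compare as follows:

```python
import math

def leaky_relu(x, alpha=0.1):
    # Leaky ReLU, as used in the original YOLO tiny models:
    # passes positive inputs unchanged, scales negative inputs by alpha.
    return x if x > 0 else alpha * x

def swish(x):
    # Swish: x * sigmoid(x). Unlike Leaky ReLU it is smooth and
    # non-monotonic, which the abstract credits with enhancing the
    # expression of the input data and the weights to be learned.
    return x / (1.0 + math.exp(-x))
```

For example, at x = -1 Leaky ReLU gives -0.1 while swish gives about -0.27, and both approach the identity for large positive x.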