Staff View: Multi level refinement enriched feature pyramid network for scale and class imbalance in object detection

Multi level refinement enriched feature pyramid network for scale and class imbalance in object detection

Object detection becomes challenging due to feature unbalancing, less contextual information and class imbalance. The feature pyramid has been used to learn multiscale representation in modern detectors. However, the current version of the feature pyramid failed to integrate useful semantic informat...

Bibliographic Details
Main Author:	Aziz, Lubna
Format:	Thesis
Language:	English
Published:	2022
Subjects:	QA75 Electronic computers. Computer science
Online Access:	http://eprints.utm.my/id/eprint/101479/1/LubnaAzizPSC2022.pdf.pdf http://eprints.utm.my/id/eprint/101479/ http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:150788

id	my.utm.101479
record_format	eprints
spelling	my.utm.101479 2023-06-21T10:10:15Z http://eprints.utm.my/id/eprint/101479/ Multi level refinement enriched feature pyramid network for scale and class imbalance in object detection Aziz, Lubna QA75 Electronic computers. Computer science Object detection becomes challenging due to feature unbalancing, less contextual information and class imbalance. The feature pyramid has been used to learn multiscale representation in modern detectors. However, the current version of the feature pyramid failed to integrate useful semantic information across different scales. In addition, many negative anchors are generated during training, resulting in extreme class imbalance. This study proposed a Multi-Level Refinement Enriched Feature Pyramid Network (MREFP-Net) to jointly handle feature-level scale imbalance and class imbalance in object detection. Instead of designing a complex approach, a simple and effective multi-layered feature enrichment scheme was proposed that effectively combines deep, intermediate, and shallow features to obtain important semantic and spatial information for small object detection. In addition, a chained parallel pooling was proposed to capture rich background contextual information. A cascaded anchor refinement scheme was introduced to integrate useful multiscale contextual information into Single Shot MultiBox Detector's prediction layers to improve the multiscale detection's distinctiveness. The ultimate goal of the cascaded anchor refinement scheme was to counteract the class imbalance by refining anchors and enriching contextual features to improve regression and classification. The performance of MREFP-Net was evaluated using two benchmark datasets, MS COCO and PASCAL VOC 07/12. For a 300 × 300 input on MS COCO test-dev, MREFP-Net-ResNet101 achieved a state-of-the-art detection accuracy AP of 36.6 with single-scale inference strategy and 39.2 ms on RTX 2060 GPU. For a 512 × 512 input on MS COCO test-dev, MREFP-Net obtained an absolute gain of 2.5%. In particular, the results of MREFP-Net-VGG were benchmarked with 800 × 800 input on MS COCO test-dev: 49.2 AP with a multiscale inference strategy. For 300 × 300 input, MREFP-Net achieved 82.5% mAP on VOC07+12+COCO, and for 512 × 512 input, MREFP-Net obtained 84.6% mAP. Finally, feature visualization, object characteristic analysis and false-positive error analysis were performed to highlight the effectiveness of enriched features for small object detection. This study has proven that the proposed MREFP-Net was capable of detecting small objects and learning sensitive features to deal with scale, class imbalances, and appearance complexity across object instances. 2022 Thesis NonPeerReviewed application/pdf en http://eprints.utm.my/id/eprint/101479/1/LubnaAzizPSC2022.pdf.pdf Aziz, Lubna (2022) Multi level refinement enriched feature pyramid network for scale and class imbalance in object detection. PhD thesis, Universiti Teknologi Malaysia. http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:150788
institution	Universiti Teknologi Malaysia
building	UTM Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Teknologi Malaysia
content_source	UTM Institutional Repository
url_provider	http://eprints.utm.my/
language	English
topic	QA75 Electronic computers. Computer science
spellingShingle	QA75 Electronic computers. Computer science Aziz, Lubna Multi level refinement enriched feature pyramid network for scale and class imbalance in object detection
description	Object detection becomes challenging due to feature unbalancing, less contextual information and class imbalance. The feature pyramid has been used to learn multiscale representation in modern detectors. However, the current version of the feature pyramid failed to integrate useful semantic information across different scales. In addition, many negative anchors are generated during training, resulting in extreme class imbalance. This study proposed a Multi-Level Refinement Enriched Feature Pyramid Network (MREFP-Net) to jointly handle feature-level scale imbalance and class imbalance in object detection. Instead of designing a complex approach, a simple and effective multi-layered feature enrichment scheme was proposed that effectively combines deep, intermediate, and shallow features to obtain important semantic and spatial information for small object detection. In addition, a chained parallel pooling was proposed to capture rich background contextual information. A cascaded anchor refinement scheme was introduced to integrate useful multiscale contextual information into Single Shot MultiBox Detector's prediction layers to improve the multiscale detection's distinctiveness. The ultimate goal of the cascaded anchor refinement scheme was to counteract the class imbalance by refining anchors and enriching contextual features to improve regression and classification. The performance of MREFP-Net was evaluated using two benchmark datasets, MS COCO and PASCAL VOC 07/12. For a 300 × 300 input on MS COCO test-dev, MREFP-Net-ResNet101 achieved a state-of-the-art detection accuracy AP of 36.6 with single-scale inference strategy and 39.2 ms on RTX 2060 GPU. For a 512 × 512 input on MS COCO test-dev, MREFP-Net obtained an absolute gain of 2.5%. In particular, the results of MREFP-Net-VGG were benchmarked with 800 × 800 input on MS COCO test-dev: 49.2 AP with a multiscale inference strategy. For 300 × 300 input, MREFP-Net achieved 82.5% mAP on VOC07+12+COCO, and for 512 × 512 input, MREFP-Net obtained 84.6% mAP. Finally, feature visualization, object characteristic analysis and false-positive error analysis were performed to highlight the effectiveness of enriched features for small object detection. This study has proven that the proposed MREFP-Net was capable of detecting small objects and learning sensitive features to deal with scale, class imbalances, and appearance complexity across object instances.
format	Thesis
author	Aziz, Lubna
author_facet	Aziz, Lubna
author_sort	Aziz, Lubna
title	Multi level refinement enriched feature pyramid network for scale and class imbalance in object detection
title_short	Multi level refinement enriched feature pyramid network for scale and class imbalance in object detection
title_full	Multi level refinement enriched feature pyramid network for scale and class imbalance in object detection
title_fullStr	Multi level refinement enriched feature pyramid network for scale and class imbalance in object detection
title_full_unstemmed	Multi level refinement enriched feature pyramid network for scale and class imbalance in object detection
title_sort	multi level refinement enriched feature pyramid network for scale and class imbalance in object detection
publishDate	2022
url	http://eprints.utm.my/id/eprint/101479/1/LubnaAzizPSC2022.pdf.pdf http://eprints.utm.my/id/eprint/101479/ http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:150788
_version_	1769842061142392832
score	13.188404

Similar Items