Multi level refinement enriched feature pyramid network for scale and class imbalance in object detection

Object detection is challenging because of feature imbalance, limited contextual information, and class imbalance. Feature pyramids are used to learn multiscale representations in modern detectors. However, current feature pyramids fail to integrate useful semantic informat...

Bibliographic Details
Main Author: Aziz, Lubna
Format: Thesis
Language: English
Published: 2022
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/id/eprint/101479/1/LubnaAzizPSC2022.pdf.pdf
http://eprints.utm.my/id/eprint/101479/
http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:150788
id my.utm.101479
record_format eprints
spelling my.utm.101479 2023-06-21T10:10:15Z http://eprints.utm.my/id/eprint/101479/ QA75 Electronic computers. Computer science 2022 Thesis NonPeerReviewed application/pdf en http://eprints.utm.my/id/eprint/101479/1/LubnaAzizPSC2022.pdf.pdf Aziz, Lubna (2022) Multi level refinement enriched feature pyramid network for scale and class imbalance in object detection. PhD thesis, Universiti Teknologi Malaysia. http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:150788
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
language English
topic QA75 Electronic computers. Computer science
description Object detection is challenging because of feature imbalance, limited contextual information, and class imbalance. Feature pyramids are used to learn multiscale representations in modern detectors. However, current feature pyramids fail to integrate useful semantic information across different scales. In addition, many negative anchors are generated during training, resulting in extreme class imbalance. This study proposed a Multi-Level Refinement Enriched Feature Pyramid Network (MREFP-Net) to jointly handle feature-level scale imbalance and class imbalance in object detection. Instead of a complex design, a simple and effective multi-layered feature enrichment scheme was proposed that combines deep, intermediate, and shallow features to obtain the semantic and spatial information needed for small object detection. In addition, chained parallel pooling was proposed to capture rich background contextual information. A cascaded anchor refinement scheme was introduced to integrate useful multiscale contextual information into the prediction layers of the Single Shot MultiBox Detector and make multiscale detections more distinctive. The ultimate goal of the cascaded anchor refinement scheme was to counteract class imbalance by refining anchors and enriching contextual features to improve regression and classification. The performance of MREFP-Net was evaluated on two benchmark datasets, MS-COCO and PASCAL VOC 07/12. For a 300 × 300 input on MS-COCO test-dev, MREFP-Net-ResNet101 achieved a state-of-the-art detection accuracy (AP) of 36.6 with a single-scale inference strategy, at 39.2 ms on an RTX 2060 GPU. For a 512 × 512 input on MS-COCO test-dev, MREFP-Net obtained an absolute gain of 2.5%. In particular, MREFP-Net-VGG was benchmarked with an 800 × 800 input on MS-COCO test-dev and reached 49.2 AP with a multiscale inference strategy. For a 300 × 300 input, MREFP-Net achieved 82.5% mAP on VOC07+12+COCO, and for a 512 × 512 input it obtained 84.6% mAP. Finally, feature visualization, object characteristic analysis, and false-positive error analysis were performed to highlight the effectiveness of enriched features for small object detection. The study showed that the proposed MREFP-Net can detect small objects and learn sensitive features to deal with scale imbalance, class imbalance, and appearance complexity across object instances.
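The class imbalance mentioned in the description (many negative anchors per positive one) can be illustrated with a small sketch. This is not the thesis's method or code: the anchor layout, the anchor sizes, the single ground-truth box, and the 0.5 IoU threshold are all assumptions chosen for illustration, and the function names are hypothetical.

```python
from itertools import product


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def make_anchors(image_size=300, stride=8, sizes=(32, 64, 128)):
    """Square anchors centred on a regular grid (assumed layout, not from the thesis)."""
    centres = range(stride // 2, image_size, stride)
    anchors = []
    for cx, cy in product(centres, centres):
        for s in sizes:
            half = s / 2
            anchors.append((cx - half, cy - half, cx + half, cy + half))
    return anchors


def count_labels(anchors, gt_boxes, pos_iou=0.5):
    """Count anchors that match any ground-truth box (positive) versus the rest (negative)."""
    positives = sum(
        1 for a in anchors if any(iou(a, g) >= pos_iou for g in gt_boxes)
    )
    return positives, len(anchors) - positives


# One hypothetical 60 x 60 object in a 300 x 300 image.
anchors = make_anchors()
gt_boxes = [(100.0, 100.0, 160.0, 160.0)]
positives, negatives = count_labels(anchors, gt_boxes)
print(f"{len(anchors)} anchors: {positives} positive, {negatives} negative")
```

Under these assumptions only a small fraction of the anchors overlap the object enough to count as positive, which is the kind of imbalance that anchor refinement schemes aim to reduce.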
format Thesis
author Aziz, Lubna
title Multi level refinement enriched feature pyramid network for scale and class imbalance in object detection
publishDate 2022
url http://eprints.utm.my/id/eprint/101479/1/LubnaAzizPSC2022.pdf.pdf
http://eprints.utm.my/id/eprint/101479/
http://dms.library.utm.my:8080/vital/access/manager/Repository/vital:150788