Multi-level refinement enriched feature pyramid network for object detection

Class imbalance and scale imbalance are common in object detection. Class imbalance arises from unequal numbers of instances across classes, while scale imbalance arises when objects have different scales and a different number of examples of d...

Bibliographic Details
Main Authors: Aziz, Lubna, Salam F. C., Md. Sah, Ayub, Sara
Format: Article
Published: Elsevier B.V. 2021
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/id/eprint/94086/
http://dx.doi.org/10.1016/j.imavis.2021.104287
id my.utm.94086
record_format eprints
spelling my.utm.94086 2022-02-28T13:31:41Z http://eprints.utm.my/id/eprint/94086/ Multi-level refinement enriched feature pyramid network for object detection Aziz, Lubna Salam F. C., Md. Sah Ayub, Sara QA75 Electronic computers. Computer science Elsevier B.V. 2021-11 Article PeerReviewed Aziz, Lubna and Salam F. C., Md. Sah and Ayub, Sara (2021) Multi-level refinement enriched feature pyramid network for object detection. Image and Vision Computing, 115. ISSN 0262-8856 http://dx.doi.org/10.1016/j.imavis.2021.104287 DOI:10.1016/j.imavis.2021.104287
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
topic QA75 Electronic computers. Computer science
description Class imbalance and scale imbalance are common in object detection. Class imbalance arises from unequal numbers of instances across classes, while scale imbalance arises when objects have different scales and the number of examples differs across scales. To address scale variance (scale imbalance) and class imbalance together, we propose a simple and effective feature enhancement scheme that explicitly uses all the information in a multi-level structure to generate a multi-level contextual feature pyramid with multiple scales. We also introduce a cascaded refinement scheme that incorporates multi-scale contextual features into the prediction layers of the Single Shot Detector (SSD) to improve their distinctiveness for multi-scale detection. The feature enhancement scheme uses a stack of multi-scale contextual feature modules to merge the multi-level and multi-scale features. We then collect features of equivalent scale through the Multi-layer Feature Fusion (MLFF) unit to construct a feature pyramid in which each feature map is made up of layers from multiple levels. Additional robustness and contextual information are integrated into the pyramid through a chain parallel pooling operation. To improve classification and regression, the cascaded refinement scheme captures a large amount of contextual information and refines the anchors to address the class imbalance problem. Experiments are carried out on two benchmark datasets: MS COCO and PASCAL VOC 07/12. The proposed approach achieves state-of-the-art accuracy, with an AP of 40.6 under multi-scale inference on MS COCO Test-dev (input size 320 × 320). For 512 × 512 input on MS COCO Test-dev, our approach yields an absolute gain in precision of 1.8% over the best reported single-stage detector results (AP: 45.7).
format Article
author Aziz, Lubna
Salam F. C., Md. Sah
Ayub, Sara
title Multi-level refinement enriched feature pyramid network for object detection
publisher Elsevier B.V.
publishDate 2021
url http://eprints.utm.my/id/eprint/94086/
http://dx.doi.org/10.1016/j.imavis.2021.104287