Multi-level refinement enriched feature pyramid network for object detection
Class imbalance and scale imbalance are common in object detection. Class imbalance arises when the number of instances differs substantially across classes, while scale imbalance arises when objects have different scales and a different number of examples of d...
Main Authors: | Aziz, Lubna; Salam F. C., Md. Sah; Ayub, Sara |
---|---|
Format: | Article |
Published: | Elsevier B.V. 2021 |
Subjects: | QA75 Electronic computers. Computer science |
Online Access: | http://eprints.utm.my/id/eprint/94086/ http://dx.doi.org/10.1016/j.imavis.2021.104287 |
id |
my.utm.94086 |
---|---|
record_format |
eprints |
spelling |
my.utm.94086 2022-02-28T13:31:41Z http://eprints.utm.my/id/eprint/94086/ Multi-level refinement enriched feature pyramid network for object detection. Aziz, Lubna; Salam F. C., Md. Sah; Ayub, Sara. QA75 Electronic computers. Computer science. Class imbalance and scale imbalance are common in object detection. Class imbalance arises when the number of instances differs substantially across classes, while scale imbalance arises when objects have different scales and the number of examples differs across scales. To address scale variance (scale imbalance) and class imbalance together, we propose a simple and effective feature enhancement scheme that explicitly uses all the information in a multi-level structure to generate a multi-level contextual feature pyramid with multiple scales. We also introduce a cascaded refinement scheme that incorporates multi-scale contextual features into the prediction layers of the Single Shot Detector (SSD) to improve their distinctiveness for multi-scale detection. The feature enhancement scheme uses a stack of multi-scale contextual feature modules to merge the multi-level and multi-scale features. We then collect features of equivalent scale through the Multi-layer Feature Fusion (MLFF) unit to construct a feature pyramid in which each feature map is made up of layers from multiple levels. Further robustness and contextual information are integrated into the pyramid through a chain parallel pooling operation. To improve classification and regression, the cascaded refinement scheme captures a large amount of contextual information and refines the anchors to address the class imbalance problem. Experiments are carried out on two benchmark datasets: MS COCO and PASCAL VOC 07/12. Our proposed approach achieves state-of-the-art accuracy with an AP of 40.6 under multi-scale inference on MS COCO Test-dev (input size 320 × 320). For 512 × 512 input on MS COCO Test-dev, our approach yields an absolute gain in precision of 1.8% compared to the best reported results of single-stage detectors (AP: 45.7). Elsevier B.V. 2021-11. Article. PeerReviewed. Aziz, Lubna and Salam F. C., Md. Sah and Ayub, Sara (2021) Multi-level refinement enriched feature pyramid network for object detection. Image and Vision Computing, 115. ISSN 0262-8856. http://dx.doi.org/10.1016/j.imavis.2021.104287 DOI:10.1016/j.imavis.2021.104287 |
institution |
Universiti Teknologi Malaysia |
building |
UTM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Teknologi Malaysia |
content_source |
UTM Institutional Repository |
url_provider |
http://eprints.utm.my/ |
topic |
QA75 Electronic computers. Computer science |
spellingShingle |
QA75 Electronic computers. Computer science; Aziz, Lubna; Salam F. C., Md. Sah; Ayub, Sara; Multi-level refinement enriched feature pyramid network for object detection |
description |
Class imbalance and scale imbalance are common in object detection. Class imbalance arises when the number of instances differs substantially across classes, while scale imbalance arises when objects have different scales and the number of examples differs across scales. To address scale variance (scale imbalance) and class imbalance together, we propose a simple and effective feature enhancement scheme that explicitly uses all the information in a multi-level structure to generate a multi-level contextual feature pyramid with multiple scales. We also introduce a cascaded refinement scheme that incorporates multi-scale contextual features into the prediction layers of the Single Shot Detector (SSD) to improve their distinctiveness for multi-scale detection. The feature enhancement scheme uses a stack of multi-scale contextual feature modules to merge the multi-level and multi-scale features. We then collect features of equivalent scale through the Multi-layer Feature Fusion (MLFF) unit to construct a feature pyramid in which each feature map is made up of layers from multiple levels. Further robustness and contextual information are integrated into the pyramid through a chain parallel pooling operation. To improve classification and regression, the cascaded refinement scheme captures a large amount of contextual information and refines the anchors to address the class imbalance problem. Experiments are carried out on two benchmark datasets: MS COCO and PASCAL VOC 07/12. Our proposed approach achieves state-of-the-art accuracy with an AP of 40.6 under multi-scale inference on MS COCO Test-dev (input size 320 × 320). For 512 × 512 input on MS COCO Test-dev, our approach yields an absolute gain in precision of 1.8% compared to the best reported results of single-stage detectors (AP: 45.7). |
format |
Article |
author |
Aziz, Lubna; Salam F. C., Md. Sah; Ayub, Sara |
author_facet |
Aziz, Lubna; Salam F. C., Md. Sah; Ayub, Sara |
author_sort |
Aziz, Lubna |
title |
Multi-level refinement enriched feature pyramid network for object detection |
title_short |
Multi-level refinement enriched feature pyramid network for object detection |
title_full |
Multi-level refinement enriched feature pyramid network for object detection |
title_fullStr |
Multi-level refinement enriched feature pyramid network for object detection |
title_full_unstemmed |
Multi-level refinement enriched feature pyramid network for object detection |
title_sort |
multi-level refinement enriched feature pyramid network for object detection |
publisher |
Elsevier B.V. |
publishDate |
2021 |
url |
http://eprints.utm.my/id/eprint/94086/ http://dx.doi.org/10.1016/j.imavis.2021.104287 |
_version_ |
1726791478868443136 |
score |
13.18916 |
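
The description above mentions collecting equivalent-scale features from multiple levels into one feature map and enriching it through parallel pooling. The following is a minimal NumPy sketch of that general idea only, not the authors' MLFF unit or their chain parallel pooling operation. The function names (`resize_nearest`, `fuse_levels`, `parallel_pool`), the nearest-neighbour resizing, the average pooling, and the window sizes are all assumptions made for illustration; square feature maps whose side is divisible by every window size are also assumed.

```python
import numpy as np


def resize_nearest(feat, size):
    """Resize a (C, H, W) feature map to (C, size, size) by nearest-neighbour sampling."""
    _, h, w = feat.shape
    rows = np.arange(size) * h // size  # source row for each target row
    cols = np.arange(size) * w // size  # source column for each target column
    return feat[:, rows[:, None], cols[None, :]]


def fuse_levels(features, size):
    """Bring every level to one spatial scale, then concatenate along channels."""
    return np.concatenate([resize_nearest(f, size) for f in features], axis=0)


def parallel_pool(feat, windows=(1, 2, 4)):
    """Average-pool with several window sizes in parallel (hypothetical choice),
    resize each branch back to the input resolution, and concatenate."""
    c, h, w = feat.shape
    branches = []
    for k in windows:
        # Non-overlapping k x k average pooling; assumes h and w are divisible by k.
        pooled = feat.reshape(c, h // k, k, w // k, k).mean(axis=(2, 4))
        branches.append(resize_nearest(pooled, h))
    return np.concatenate(branches, axis=0)


# Three made-up backbone levels with different resolutions and channel counts.
rng = np.random.default_rng(0)
levels = [
    rng.standard_normal((4, 32, 32)),
    rng.standard_normal((8, 16, 16)),
    rng.standard_normal((16, 8, 8)),
]
fused = fuse_levels(levels, size=16)
enriched = parallel_pool(fused)
print(fused.shape)     # (28, 16, 16)
print(enriched.shape)  # (84, 16, 16)
```

Repeating `fuse_levels` for each target size would give one fused map per pyramid level; a real detector would follow the concatenation with learned convolutions, which this sketch omits.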