ACORT: A compact object relation transformer for parameter efficient image captioning

Recent research that applies Transformer-based architectures to image captioning has resulted in state-of-the-art image captioning performance, capitalising on the success of Transformers on natural language tasks. Unfortunately, though these models work well, one major flaw is their large model size...

Full description

Saved in:
Bibliographic Details
Main Authors: Tan, Jia Huei, Tan, Ying Hua, Chan, Chee Seng, Chuah, Joon Huang
Format: Article
Published: Elsevier 2022
Subjects: QA75 Electronic computers. Computer science
Online Access:http://eprints.um.edu.my/32731/
id my.um.eprints.32731
record_format eprints
spelling my.um.eprints.32731 2022-08-11T00:45:34Z http://eprints.um.edu.my/32731/
ACORT: A compact object relation transformer for parameter efficient image captioning
Tan, Jia Huei; Tan, Ying Hua; Chan, Chee Seng; Chuah, Joon Huang
QA75 Electronic computers. Computer science
Recent research that applies Transformer-based architectures to image captioning has resulted in state-of-the-art image captioning performance, capitalising on the success of Transformers on natural language tasks. Unfortunately, though these models work well, one major flaw is their large model sizes. To this end, we present three parameter reduction methods for image captioning Transformers: Radix Encoding, cross-layer parameter sharing, and attention parameter sharing. By combining these methods, our proposed ACORT models have 3.7x to 21.6x fewer parameters than the baseline model without compromising test performance. Results on the MS-COCO dataset demonstrate that our ACORT models are competitive against baselines and SOTA approaches, with CIDEr scores ≥126. Finally, we present qualitative results and ablation studies to demonstrate the efficacy of the proposed changes further. Code and pre-trained models are publicly available at https://github.com/jiahuei/sparse-image-captioning. (c) 2022 Published by Elsevier B.V.
Published: Elsevier, 2022-04-14. Article, peer reviewed.
Citation: Tan, Jia Huei and Tan, Ying Hua and Chan, Chee Seng and Chuah, Joon Huang (2022) ACORT: A compact object relation transformer for parameter efficient image captioning. Neurocomputing, 482. pp. 60-72. ISSN 0925-2312. DOI: 10.1016/j.neucom.2022.01.081 <https://doi.org/10.1016/j.neucom.2022.01.081>.
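The abstract names cross-layer parameter sharing as one of the three parameter reduction methods. The sketch below illustrates the general idea under simple assumptions: if every decoder layer reuses one set of weights instead of holding its own copy, the layer-stack parameter count drops by roughly the number of layers. The `Layer` class, dimensions, and counting logic are illustrative only and are not taken from the ACORT codebase.

```python
# Hedged sketch of cross-layer parameter sharing (illustrative,
# not the ACORT implementation).

class Layer:
    """Stand-in for one Transformer decoder layer; tracks only a parameter count."""
    def __init__(self, d_model=512, d_ff=2048):
        # Rough count: Q, K, V, and output projections, plus the
        # two feed-forward projections. Biases and layer norms omitted.
        self.n_params = 4 * d_model * d_model + 2 * d_model * d_ff

def count_params(layers):
    # Shared layers are literally the same object, so deduplicate
    # by identity before summing: each distinct weight set counts once.
    return sum(l.n_params for l in {id(l): l for l in layers}.values())

n_layers = 6
independent = [Layer() for _ in range(n_layers)]  # baseline: 6 separate copies
shared_layer = Layer()
shared = [shared_layer] * n_layers                # one weight set reused 6 times

print(count_params(independent))  # 18874368
print(count_params(shared))       # 3145728 -- a 6x reduction from sharing alone
```

In a real model the shared module would still be applied once per layer position at forward time, so compute is unchanged; only storage shrinks, which is consistent with the 3.7x to 21.6x parameter reductions the abstract reports when sharing is combined with the other two methods.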
institution Universiti Malaya
building UM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaya
content_source UM Research Repository
url_provider http://eprints.um.edu.my/