Efficient label-free pruning and retraining for Text-VQA Transformers

Recent advancements in Scene Text Visual Question Answering (Text-VQA) employ autoregressive Transformers, showing improved performance with larger models and pre-training datasets. Although various pruning frameworks exist to simplify Transformers, many are integrated into the time-consuming training process. Researchers have recently explored post-training pruning techniques, which separate pruning from training and reduce time consumption. Some methods use gradient-based importance scores that rely on labeled data, while others offer retraining-free algorithms that quickly enhance pruned model accuracy. This paper proposes a novel gradient-based importance score that requires only raw, unlabeled data for post-training structured pruning of autoregressive Transformers. Additionally, we introduce a Retraining Strategy (ReSt) for efficient performance restoration of pruned models of arbitrary sizes. We evaluate our approach on the TextVQA and ST-VQA datasets using TAP, TAP†† and SaL‡-Base, all of which utilize autoregressive Transformers. On TAP and TAP††, our pruning approach achieves up to a 60% reduction in size with less than a 2.4% accuracy drop, and the proposed ReSt retraining approach takes only 3 to 34 minutes, comparable to existing retraining-free techniques. On SaL‡-Base, the proposed method achieves up to 50% parameter reduction with less than a 2.9% accuracy drop, requiring only 1.19 hours of retraining using the proposed ReSt approach. The code is publicly accessible at https://github.com/soonchangAI/LFPR.
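
The abstract does not spell out how the label-free importance score is computed. As a rough illustration of the general idea only, the sketch below shows one common way to obtain a gradient-based score without annotations: back-propagate a surrogate loss built from the model's own argmax predictions on raw inputs, and accumulate the gradient magnitude on a per-head gate (a first-order Taylor criterion). The names `model`, `head_gates` and `unlabeled_loader` are hypothetical stand-ins, not the paper's actual interfaces.

    import torch
    import torch.nn.functional as F

    def label_free_head_importance(model, head_gates, unlabeled_loader):
        # Hypothetical sketch: `model` is assumed to scale each attention
        # head's output by the matching scalar in `head_gates` (all ones,
        # requires_grad=True). With no ground-truth answers available,
        # the model's own argmax predictions act as pseudo-labels for a
        # surrogate cross-entropy loss.
        importance = torch.zeros_like(head_gates)
        model.eval()
        for inputs in unlabeled_loader:
            logits = model(inputs)                    # (batch, vocab_size)
            pseudo = logits.detach().argmax(dim=-1)   # pseudo-labels from raw data
            loss = F.cross_entropy(logits, pseudo)    # label-free surrogate loss
            (grad,) = torch.autograd.grad(loss, head_gates)
            importance += grad.abs()                  # accumulate |dL/d gate| per head
        return importance

    # Heads with the lowest accumulated scores would be pruned first:
    # prune_order = torch.argsort(importance)

Under a criterion of this kind, scoring needs only forward and backward passes over unlabeled data, which is what allows the pruning step to be decoupled from the labeled training pipeline, as the abstract describes.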

Bibliographic Details
Main Authors: Poh, Soon Chang, Chan, Chee Seng, Lim, Chee Kau
Format: Article (peer reviewed)
Published: Elsevier 2024
Published in: Pattern Recognition Letters, 183, pp. 1-8. ISSN 0167-8655
Subjects: QA75 Electronic computers. Computer science
Online Access:http://eprints.um.edu.my/45137/
https://doi.org/10.1016/j.patrec.2024.04.024
id my.um.eprints.45137
record_format eprints
institution Universiti Malaya
building UM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaya
content_source UM Research Repository
url_provider http://eprints.um.edu.my/
topic QA75 Electronic computers. Computer science