Pre-trained language model with feature reduction and no fine-tuning

Pre-trained language models have been proven to achieve excellent results in Natural Language Processing tasks such as Sentiment Analysis. However, the sentence embedding produced by the base Bidirectional Encoder Representations from Transformers (BERT) model has 768 features per sentence, so a huge dataset contains mil...

Full description

Saved in:
Bibliographic Details
Main Authors: Kit, Y. H., Mokji, M.
Format: Conference or Workshop Item
Published: 2022
Subjects:
Online Access:http://eprints.utm.my/id/eprint/98842/
http://dx.doi.org/10.1007/978-981-19-3923-5_59
id my.utm.98842
record_format eprints
spelling my.utm.98842 2023-02-02T09:36:04Z http://eprints.utm.my/id/eprint/98842/ Pre-trained language model with feature reduction and no fine-tuning Kit, Y. H. Mokji, M. TK Electrical engineering. Electronics Nuclear engineering Pre-trained language models have been proven to achieve excellent results in Natural Language Processing tasks such as Sentiment Analysis. However, the sentence embedding produced by the base Bidirectional Encoder Representations from Transformers (BERT) model has 768 features per sentence, so a huge dataset contains millions of unique values, which increases the complexity of the system. Thus, this paper presents feature reduction of BERT sentence embeddings for classification, using a feature reduction algorithm to decrease the number of features and the system complexity. With 50% fewer features, the experimental results show that the proposed system improves accuracy by 1%-2% while using 89% less GPU memory. 2022 Conference or Workshop Item PeerReviewed Kit, Y. H. and Mokji, M. (2022) Pre-trained language model with feature reduction and no fine-tuning. In: 3rd International Conference on Control, Instrumentation and Mechatronics Engineering, CIM 2022, 2 March 2022 - 3 March 2022, Virtual, Online. http://dx.doi.org/10.1007/978-981-19-3923-5_59
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
topic TK Electrical engineering. Electronics Nuclear engineering
spellingShingle TK Electrical engineering. Electronics Nuclear engineering
Kit, Y. H.
Mokji, M.
Pre-trained language model with feature reduction and no fine-tuning
description Pre-trained language models have been proven to achieve excellent results in Natural Language Processing tasks such as Sentiment Analysis. However, the sentence embedding produced by the base Bidirectional Encoder Representations from Transformers (BERT) model has 768 features per sentence, so a huge dataset contains millions of unique values, which increases the complexity of the system. Thus, this paper presents feature reduction of BERT sentence embeddings for classification, using a feature reduction algorithm to decrease the number of features and the system complexity. With 50% fewer features, the experimental results show that the proposed system improves accuracy by 1%-2% while using 89% less GPU memory.
format Conference or Workshop Item
author Kit, Y. H.
Mokji, M.
author_facet Kit, Y. H.
Mokji, M.
author_sort Kit, Y. H.
title Pre-trained language model with feature reduction and no fine-tuning
title_short Pre-trained language model with feature reduction and no fine-tuning
title_full Pre-trained language model with feature reduction and no fine-tuning
title_fullStr Pre-trained language model with feature reduction and no fine-tuning
title_full_unstemmed Pre-trained language model with feature reduction and no fine-tuning
title_sort pre-trained language model with feature reduction and no fine-tuning
publishDate 2022
url http://eprints.utm.my/id/eprint/98842/
http://dx.doi.org/10.1007/978-981-19-3923-5_59
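Illustrative sketch (not part of the record): the abstract describes a pipeline of frozen, not fine-tuned, BERT-base sentence embeddings (768 values per sentence), a feature reduction step that halves the feature count, and a downstream classifier. The record does not name the specific feature reduction algorithm or classifier, so the sketch below stands in a Gaussian random projection (768 -> 384 features) and logistic regression, and assumes the Hugging Face bert-base-uncased checkpoint; it approximates the described approach and is not the authors' implementation.

# Frozen BERT sentence embeddings -> feature reduction (50% fewer features) -> classifier.
# The reduction algorithm and classifier here are placeholders; the paper's choices are
# not named in this record.
import torch
from transformers import BertModel, BertTokenizer
from sklearn.random_projection import GaussianRandomProjection
from sklearn.linear_model import LogisticRegression

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")
bert.eval()  # the pre-trained model is used as-is: no fine-tuning

@torch.no_grad()
def embed(sentences):
    # One 768-dimensional embedding per sentence via mean pooling over token states.
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    hidden = bert(**batch).last_hidden_state      # (batch, tokens, 768)
    mask = batch["attention_mask"].unsqueeze(-1)  # zero out padding tokens
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

# Toy sentiment data for illustration only.
texts = ["great movie", "terrible plot", "loved it", "waste of time"]
labels = [1, 0, 1, 0]

X = embed(texts)                                  # (n, 768) sentence embeddings
reducer = GaussianRandomProjection(n_components=384, random_state=0)  # 768 -> 384, 50% fewer
X_small = reducer.fit_transform(X)

clf = LogisticRegression(max_iter=1000).fit(X_small, labels)
print(clf.predict(reducer.transform(embed(["what a great film"]))))

Because BERT stays frozen, only the lightweight reducer and classifier are trained; the reported 1%-2% accuracy gain and 89% lower GPU memory usage depend on the authors' actual algorithm and datasets.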