Pre-trained language model with feature reduction and no fine-tuning
Pre-trained language models have been shown to achieve excellent results in Natural Language Processing tasks such as Sentiment Analysis. However, the base model of Bidirectional Encoder Representations from Transformers (BERT) produces a sentence embedding with 768 features for every sentence, so a large dataset yields many millions of values and an increasingly complex system. This paper therefore applies a feature reduction algorithm to the BERT sentence embeddings before classification, decreasing the number of features and the overall complexity. With 50% fewer features, the experimental results show that the proposed system improves accuracy by 1%-2% while using 89% less GPU memory.
Main Authors: | Kit, Y. H., Mokji, M. |
---|---|
Format: | Conference or Workshop Item |
Published: | 2022 |
Subjects: | TK Electrical engineering. Electronics. Nuclear engineering |
Online Access: | http://eprints.utm.my/id/eprint/98842/ http://dx.doi.org/10.1007/978-981-19-3923-5_59 |
Citation: | Kit, Y. H. and Mokji, M. (2022) Pre-trained language model with feature reduction and no fine-tuning. In: 3rd International Conference on Control, Instrumentation and Mechatronics Engineering (CIM 2022), 2-3 March 2022, virtual/online. http://dx.doi.org/10.1007/978-981-19-3923-5_59 (peer reviewed) |
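The abstract describes a simple pipeline: take fixed (not fine-tuned) 768-dimensional sentence embeddings from the BERT base model, reduce them to half as many features with a feature reduction algorithm, and train a classifier on the reduced vectors. This record does not name the reduction algorithm or the classifier, so the sketch below is only an illustration of that pipeline; `bert-base-uncased`, the [CLS]-token embedding, PCA, and logistic regression are placeholder assumptions, not the paper's actual choices.

```python
# Illustrative sketch only: frozen BERT-base sentence embeddings (768 values per
# sentence), reduced to 50% of the features before classification. PCA and
# logistic regression are placeholder choices; the record does not name the
# feature reduction algorithm or classifier actually used.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")
bert.eval()  # "no fine-tuning": the BERT weights are never updated


def embed(sentences):
    """Return one 768-dimensional [CLS] embedding per sentence."""
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = bert(**batch)
    return out.last_hidden_state[:, 0, :].numpy()


# Toy sentiment data, for illustration only.
texts = ["great movie, loved it", "terrible plot and acting", "what a gem", "boring and slow"]
labels = [1, 0, 1, 0]

X = embed(texts)  # shape (n_sentences, 768)

# Target is 384 components (50% of 768); PCA cannot exceed the sample count,
# so this toy example is capped accordingly.
pca = PCA(n_components=min(384, len(texts)))
X_reduced = pca.fit_transform(X)

clf = LogisticRegression(max_iter=1000).fit(X_reduced, labels)
print(clf.predict(pca.transform(embed(["really enjoyable film"]))))
```

On a realistic dataset the reduction target would be 384 components (half of 768), matching the 50% feature reduction reported in the abstract.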
id | my.utm.98842
---|---
institution | Universiti Teknologi Malaysia
building | UTM Library
collection | Institutional Repository
continent | Asia
country | Malaysia
content_provider | Universiti Teknologi Malaysia
content_source | UTM Institutional Repository
url_provider | http://eprints.utm.my/
topic | TK Electrical engineering. Electronics. Nuclear engineering