Staff View: Context-based feature technique for sarcasm identification in benchmark datasets using deep learning and bert model

Sarcasm is a complicated linguistic term commonly found in e-commerce and social media sites. Failure to identify sarcastic utterances in Natural Language Processing applications such as sentiment analysis and opinion mining will confuse classification algorithms and generate false results. Several...

Bibliographic Details
Main Authors:	Eke, Christopher Ifeanyi; Norman, Azah Anir; Shuib, Liyana
Format:	Article
Published:	Institute of Electrical and Electronics Engineers 2021
Subjects:	QA75 Electronic computers. Computer science; TK Electrical engineering. Electronics. Nuclear engineering
Online Access:	http://eprints.um.edu.my/26991/

id	my.um.eprints.26991
record_format	eprints
spelling	my.um.eprints.26991 2022-04-05T07:34:50Z http://eprints.um.edu.my/26991/ Context-based feature technique for sarcasm identification in benchmark datasets using deep learning and bert model Eke, Christopher Ifeanyi Norman, Azah Anir Shuib, Liyana QA75 Electronic computers. Computer science TK Electrical engineering. Electronics Nuclear engineering Sarcasm is a complicated linguistic term commonly found in e-commerce and social media sites. Failure to identify sarcastic utterances in Natural Language Processing applications such as sentiment analysis and opinion mining will confuse classification algorithms and generate false results. Several studies on sarcasm detection have utilised different learning algorithms. However, most of these learning models have focused only on the content of the expression, leaving the contextual information in isolation. As a result, they failed to capture the contextual information in the sarcastic expression. Secondly, many deep learning methods in NLP use a word embedding learning algorithm as a standard approach for feature vector representation, which ignores the sentiment polarity of the words in the sarcastic expression. This study proposes a context-based feature technique for sarcasm identification using a deep learning model, a BERT model, and conventional machine learning to address the issues mentioned above. Two Twitter benchmark datasets and the Internet Argument Corpus, version two (IAC-v2) benchmark dataset were utilised for classification using the three learning models. The first model uses an embedding-based representation via a deep learning model with bidirectional long short-term memory (Bi-LSTM), a variant of the Recurrent Neural Network (RNN), applying Global Vectors for Word Representation (GloVe) for the construction of word embeddings and context learning. The second model is based on the Transformer, using pre-trained Bidirectional Encoder Representations from Transformers (BERT). The third model is based on feature fusion that combines BERT features, sentiment-related features, syntactic features, and GloVe embedding features with conventional machine learning. The effectiveness of this technique is tested with various evaluation experiments. Evaluation of the technique on the two Twitter benchmark datasets attained highest precision of 98.5% and 98.0%, respectively. On the IAC-v2 dataset, it achieved a highest precision of 81.2%, which shows the significance of the proposed technique over the baseline approaches for sarcasm analysis. Institute of Electrical and Electronics Engineers 2021 Article PeerReviewed Eke, Christopher Ifeanyi and Norman, Azah Anir and Shuib, Liyana (2021) Context-based feature technique for sarcasm identification in benchmark datasets using deep learning and bert model. IEEE Access, 9. pp. 48501-48518. ISSN 2169-3536, DOI https://doi.org/10.1109/ACCESS.2021.3068323 <https://doi.org/10.1109/ACCESS.2021.3068323>. 10.1109/ACCESS.2021.3068323
institution	Universiti Malaya
building	UM Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Malaya
content_source	UM Research Repository
url_provider	http://eprints.um.edu.my/
topic	QA75 Electronic computers. Computer science; TK Electrical engineering. Electronics. Nuclear engineering
spellingShingle	QA75 Electronic computers. Computer science TK Electrical engineering. Electronics Nuclear engineering Eke, Christopher Ifeanyi Norman, Azah Anir Shuib, Liyana Context-based feature technique for sarcasm identification in benchmark datasets using deep learning and bert model
description	Sarcasm is a complicated linguistic term commonly found in e-commerce and social media sites. Failure to identify sarcastic utterances in Natural Language Processing applications such as sentiment analysis and opinion mining will confuse classification algorithms and generate false results. Several studies on sarcasm detection have utilised different learning algorithms. However, most of these learning models have focused only on the content of the expression, leaving the contextual information in isolation. As a result, they failed to capture the contextual information in the sarcastic expression. Secondly, many deep learning methods in NLP use a word embedding learning algorithm as a standard approach for feature vector representation, which ignores the sentiment polarity of the words in the sarcastic expression. This study proposes a context-based feature technique for sarcasm identification using a deep learning model, a BERT model, and conventional machine learning to address the issues mentioned above. Two Twitter benchmark datasets and the Internet Argument Corpus, version two (IAC-v2) benchmark dataset were utilised for classification using the three learning models. The first model uses an embedding-based representation via a deep learning model with bidirectional long short-term memory (Bi-LSTM), a variant of the Recurrent Neural Network (RNN), applying Global Vectors for Word Representation (GloVe) for the construction of word embeddings and context learning. The second model is based on the Transformer, using pre-trained Bidirectional Encoder Representations from Transformers (BERT). The third model is based on feature fusion that combines BERT features, sentiment-related features, syntactic features, and GloVe embedding features with conventional machine learning. The effectiveness of this technique is tested with various evaluation experiments. Evaluation of the technique on the two Twitter benchmark datasets attained highest precision of 98.5% and 98.0%, respectively. On the IAC-v2 dataset, it achieved a highest precision of 81.2%, which shows the significance of the proposed technique over the baseline approaches for sarcasm analysis.
format	Article
author	Eke, Christopher Ifeanyi; Norman, Azah Anir; Shuib, Liyana
author_facet	Eke, Christopher Ifeanyi; Norman, Azah Anir; Shuib, Liyana
author_sort	Eke, Christopher Ifeanyi
title	Context-based feature technique for sarcasm identification in benchmark datasets using deep learning and bert model
title_short	Context-based feature technique for sarcasm identification in benchmark datasets using deep learning and bert model
title_full	Context-based feature technique for sarcasm identification in benchmark datasets using deep learning and bert model
title_fullStr	Context-based feature technique for sarcasm identification in benchmark datasets using deep learning and bert model
title_full_unstemmed	Context-based feature technique for sarcasm identification in benchmark datasets using deep learning and bert model
title_sort	context-based feature technique for sarcasm identification in benchmark datasets using deep learning and bert model
publisher	Institute of Electrical and Electronics Engineers
publishDate	2021
url	http://eprints.um.edu.my/26991/
_version_	1735409485221986304
score	13.211869

Similar Items