A scheme of pairwise feature combinations to improve sentiment classification using book review dataset

Sentiment Analysis is a Natural Language Processing (NLP) domain related to the identification or extraction of user sentiments or opinions from written language. Although the approaches to achieve the goals may vary, Machine Learning (ML) methods are gradually becoming the preferred method because...

Bibliographic Details
Main Authors: Abubakar, Haisal Dauda, Huspi, Sharin Hazlin, Mahmood Umar, Mahmood Umar
Format: Article
Language: English
Published: Computer Science and Information System 2022
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/108822/1/SharinHazlinHuspi2022_ASchemeofPairwiseFeatureCombinations.pdf
http://eprints.utm.my/108822/
http://dx.doi.org/10.11113/ijic.v12n1.344
id my.utm.108822
record_format eprints
spelling my.utm.108822 2024-12-09T07:46:28Z http://eprints.utm.my/108822/ Article, PeerReviewed, application/pdf, en. Abubakar, Haisal Dauda and Huspi, Sharin Hazlin and Mahmood Umar, Mahmood Umar (2022) A scheme of pairwise feature combinations to improve sentiment classification using book review dataset. International Journal of Innovative Computing, 12 (1). pp. 25-33. ISSN 2180-4370. DOI: 10.11113/ijic.v12n1.344
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
language English
topic QA75 Electronic computers. Computer science
description Sentiment Analysis is a Natural Language Processing (NLP) domain concerned with identifying or extracting user sentiments or opinions from written language. Although approaches vary, Machine Learning (ML) methods are gradually becoming the preferred choice because of their ability to automatically draw useful insight from data regardless of its complexity. However, an important prerequisite for most ML algorithms to learn from text data is to encode the text as numerical vectors. Popular approaches include the word-level representation TF-IDF, distributed word representations (word2vec), and distributed document representations (doc2vec). Each of these methods has demonstrated remarkable success in representing the encoded text; however, we found that no single method excels in all tasks. Motivated by this challenge, an improved scheme of pairwise fusion is proposed for sentiment classification of book reviews. In the experiments, Artificial Neural Network (ANN) and Logistic Regression (LR) classifiers showed that the proposed scheme improved performance compared with single-method vectorization. TF-IDF-word2vec performed best among the methods, with a mean accuracy of 91.0% (ANN) and 92.5% (LR), an improvement of 0.7% and 0.2% respectively over TF-IDF, the best single-vector method. Thus, the proposed method can be used as a compact alternative to the popular bag-of-n-gram models, as it captures contextual information of the encoded document with less sparse data.
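The abstract does not say how the pairwise fusion is computed. As a minimal sketch, assuming fusion means concatenating each review's TF-IDF vector with the mean of its word vectors, the idea might look like the following. The function names, the toy reviews, and the 2-dimensional `toy_vectors` table (a stand-in for a trained word2vec model) are all hypothetical and are not taken from the article.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocab, rows): one dense TF-IDF row per tokenized document."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    rows = []
    for doc in docs:
        counts = Counter(doc)
        length = max(len(doc), 1)  # guard against empty documents
        rows.append([counts[t] / length * idf[t] for t in vocab])
    return vocab, rows


def mean_word_vector(doc, word_vectors, dim):
    """Average the vectors of known tokens; zero vector if none are known."""
    known = [word_vectors[t] for t in doc if t in word_vectors]
    if not known:
        return [0.0] * dim
    return [sum(col) / len(known) for col in zip(*known)]


def fuse(sparse_row, dense_row):
    """Pairwise fusion by concatenation (an assumed fusion operator)."""
    return list(sparse_row) + list(dense_row)


# Two toy book reviews, already tokenized.
docs = [
    "a gripping and wonderful story".split(),
    "dull plot and boring characters".split(),
]

# Hypothetical 2-d word vectors standing in for a trained word2vec model.
toy_vectors = {
    "wonderful": [0.9, 0.1],
    "gripping": [0.7, 0.3],
    "dull": [-0.8, 0.2],
    "boring": [-0.6, 0.4],
}

vocab, tfidf_rows = tfidf_vectors(docs)
fused = [
    fuse(row, mean_word_vector(doc, toy_vectors, 2))
    for doc, row in zip(docs, tfidf_rows)
]

# Each fused row has len(vocab) TF-IDF entries followed by 2 dense entries.
print(len(vocab), len(fused[0]))
```

The fused rows could then be passed to any classifier, such as the ANN or LR models named in the abstract. Other fusion operators (for example, weighting word vectors by TF-IDF before averaging) are equally plausible readings of "pairwise fusion".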
format Article
author Abubakar, Haisal Dauda
Huspi, Sharin Hazlin
Mahmood Umar, Mahmood Umar
title A scheme of pairwise feature combinations to improve sentiment classification using book review dataset
publisher Computer Science and Information System
publishDate 2022