Senti2Vec: an effective feature extraction technique for sentiment analysis based on Word2Vec

The search for an effective feature extraction technique has been the focus of many researchers seeking to improve the performance of classification methods, such as those used for sentiment analysis. Many of them have shown interest in using word embeddings, especially Word2Vec, as the features for text classification t...

Bibliographic Details
Main Authors: M.Alshari, Eissa; Azman, Azreen; Doraisamy, Shyamala; Mustapha, Norwati; Alksher, Mostafa
Format: Article
Published: University of Malaya * Faculty of Computer Science and Information Technology 2020
Online Access: http://psasir.upm.edu.my/id/eprint/85796/
https://ejournal.um.edu.my/index.php/MJCS/article/view/25280
id my.upm.eprints.85796
record_format eprints
spelling my.upm.eprints.85796 2023-09-07T00:24:04Z http://psasir.upm.edu.my/id/eprint/85796/ Article PeerReviewed M.Alshari, Eissa and Azman, Azreen and Doraisamy, Shyamala and Mustapha, Norwati and Alksher, Mostafa (2020) Senti2Vec: an effective feature extraction technique for sentiment analysis based on Word2Vec. Malaysian Journal of Computer Science, 33 (3). 240 - 251. ISSN 0127-9084 https://ejournal.um.edu.my/index.php/MJCS/article/view/25280 10.22452/mjcs.vol33no3.5
institution Universiti Putra Malaysia
building UPM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Putra Malaysia
content_source UPM Institutional Repository
url_provider http://psasir.upm.edu.my/
description The search for an effective feature extraction technique has been the focus of many researchers seeking to improve the performance of classification methods, such as those used for sentiment analysis. Many of them have shown interest in using word embeddings, especially Word2Vec, as the features for text classification tasks. Its ability to model high-quality distributional semantics among words has contributed to its success in many of these tasks. Despite this success, Word2Vec features are high-dimensional, which increases the complexity of the classifier. In this paper, an effective method for feature extraction based on Word2Vec is proposed for sentiment analysis. The process discovers polarity clusters of the terms in the vocabulary through Word2Vec and an opinion lexical dictionary. The feature vector for each text is constructed from the polarity clusters, which leads to a lower-dimensional vector to represent the text. This paper also investigates the effect of two opinion lexical dictionaries on the performance of sentiment analysis; one of the dictionaries is created based on SentiWordNet. The effectiveness of the proposed method is evaluated on the IMDB dataset with two classifiers, namely Logistic Regression and the Support Vector Machine. The result is promising, showing that the proposed method can be more effective than the baseline approaches.
format Article
author M.Alshari, Eissa
Azman, Azreen
Doraisamy, Shyamala
Mustapha, Norwati
Alksher, Mostafa
title Senti2Vec: an effective feature extraction technique for sentiment analysis based on Word2Vec
publisher University of Malaya * Faculty of Computer Science and Information Technology
publishDate 2020
url http://psasir.upm.edu.my/id/eprint/85796/
https://ejournal.um.edu.my/index.php/MJCS/article/view/25280
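
Illustrative sketch (not the authors' implementation): the description above says that polarity clusters of vocabulary terms are discovered through Word2Vec and an opinion lexical dictionary, and that each text is then represented by a low-dimensional vector built from those clusters. The record does not give the clustering procedure, so the Python sketch below makes several assumptions: hand-written 2-D vectors stand in for trained Word2Vec embeddings, a two-word lexicon stands in for the opinion dictionaries, each word is assigned to the polarity whose seed centroid is most cosine-similar, and the feature vector is the share of a text's known tokens in each cluster. All names (EMBEDDINGS, LEXICON, build_polarity_clusters, text_features) are hypothetical.

```python
import math

# Toy 2-D vectors standing in for trained Word2Vec embeddings (hypothetical values).
EMBEDDINGS = {
    "good": (0.9, 0.1),
    "great": (0.8, 0.2),
    "enjoyable": (0.7, 0.3),
    "bad": (0.1, 0.9),
    "awful": (0.2, 0.8),
    "boring": (0.3, 0.7),
}

# Tiny stand-in for an opinion lexical dictionary (hypothetical seed words).
LEXICON = {"positive": ["good"], "negative": ["bad"]}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def build_polarity_clusters(embeddings, lexicon):
    """Assign each vocabulary word to the polarity with the most similar seed centroid."""
    centroids = {}
    for polarity, seeds in lexicon.items():
        vecs = [embeddings[w] for w in seeds if w in embeddings]
        # Component-wise mean of the seed vectors for this polarity.
        centroids[polarity] = tuple(sum(c) / len(vecs) for c in zip(*vecs))
    return {
        word: max(centroids, key=lambda p: cosine(vec, centroids[p]))
        for word, vec in embeddings.items()
    }


def text_features(tokens, clusters, polarities=("positive", "negative")):
    """Return one feature per polarity cluster: the share of known tokens in it."""
    counts = {p: 0 for p in polarities}
    for tok in tokens:
        if tok in clusters:  # tokens outside the vocabulary are ignored
            counts[clusters[tok]] += 1
    total = sum(counts.values()) or 1  # avoid division by zero for unknown-only texts
    return [counts[p] / total for p in polarities]


clusters = build_polarity_clusters(EMBEDDINGS, LEXICON)
features = text_features("a great but boring film".split(), clusters)
```

Under these assumptions each text is reduced to a vector with one entry per polarity cluster, rather than one entry per embedding dimension, and that vector could then be passed to a classifier such as Logistic Regression or a Support Vector Machine.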