Analysis of Feature Selection Methods for Sentiment Analysis Concerning Covid-19 Vaccination Issues

Bibliographic Details
Main Authors: Muhammad, Fajar; Tri Basuki, Kurniawan; Edi Surya, Negara Harahap
Format: Article
Language: English
Published: INTI International University 2023
Online Access: http://eprints.intimal.edu.my/1729/1/jods2023_03.pdf
http://eprints.intimal.edu.my/1729/
http://ipublishing.intimal.edu.my/jods.html
Description
Summary: Sentiment analysis, or opinion mining, is the computational study of people's opinions, sentiments, evaluations, attitudes, moods, and emotions. It is one of the most active research areas in natural language processing, data mining, information retrieval, and web mining. One problem identified in the sentiment analysis process is the very large number of text features. Each word or term becomes a feature (dimension) in a data table, and because the number of terms is vast, processing takes a long time and demands substantial computing power. A feature set that is too large can also reduce the quality of the model by introducing significant bias. Not all terms contribute to, or are related to, the decision labels (positive, negative, and neutral). For this reason, this study uses feature selection to retain the terms that contribute most to the labels, with the aim of improving the quality of the resulting prediction model. The study extends earlier research by another researcher by adding a feature selection step that uses two filter methods, chi-square and information gain, and one wrapper method, the Genetic Algorithm (GA). The experimental results show that GA achieved the highest accuracy among the methods compared.
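
As a rough illustration of the filter approach named in the summary, the sketch below scores each term by the chi-square statistic of its presence or absence against the sentiment label, then keeps the highest-scoring terms. The toy corpus, the whitespace tokenization, and the function names `chi_square_scores` and `select_top_k` are assumptions made for illustration only; they are not taken from the article, which may compute the statistic differently.

```python
from collections import Counter

# Toy labelled corpus, invented purely for illustration (not the study's data).
DOCS = [
    ("vaccine safe and effective", "positive"),
    ("grateful for the vaccine", "positive"),
    ("vaccine caused bad side effects", "negative"),
    ("bad experience after the vaccine", "negative"),
    ("vaccine schedule announced today", "neutral"),
    ("the clinic opens today", "neutral"),
]


def chi_square_scores(docs):
    """Score each term by the chi-square statistic of term presence vs. label."""
    n = len(docs)
    label_counts = Counter(label for _, label in docs)

    # present[term][label] = number of documents with that label containing the term
    present = {}
    for text, label in docs:
        for term in set(text.split()):  # hypothetical tokenizer: whitespace split
            present.setdefault(term, Counter())[label] += 1

    scores = {}
    for term, by_label in present.items():
        n_with = sum(by_label.values())
        n_without = n - n_with
        score = 0.0
        for label, n_label in label_counts.items():
            # Two cells per label: documents containing the term, and those without it.
            for observed, row_total in (
                (by_label[label], n_with),
                (n_label - by_label[label], n_without),
            ):
                expected = row_total * n_label / n
                if expected > 0:
                    score += (observed - expected) ** 2 / expected
        scores[term] = score
    return scores


def select_top_k(docs, k):
    """Return the k terms with the highest chi-square score (ties broken by name)."""
    scores = chi_square_scores(docs)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


if __name__ == "__main__":
    print(select_top_k(DOCS, 2))
```

In this toy corpus, a term such as "the" that appears equally often under every label scores zero, while terms concentrated in one label score highest. This is the sense in which a filter method discards terms that contribute little to the label.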