Performance evaluation of hybrid feature selection technique for sentiment classification based on food reviews
Main Authors: | , |
---|---|
Format: | Conference or Workshop Item |
Language: | English |
Published: | IEEE 2021 |
Subjects: | |
Online Access: | http://umpir.ump.edu.my/id/eprint/33473/1/Performance%20evaluation%20of%20hybrid%20feature%20selection%20technique_FULL.pdf http://umpir.ump.edu.my/id/eprint/33473/2/Performance%20evaluation%20of%20hybrid%20feature%20selection%20technique.pdf http://umpir.ump.edu.my/id/eprint/33473/ https://doi.org/10.1109/ICSECS52883.2021.00038 |
Summary: | This paper evaluates the performance of sentiment classification using a hybrid feature selection technique. The technique addresses the lack of feature importance evaluation by combining Term Frequency-Inverse Document Frequency (TF-IDF) with Support Vector Machine-Recursive Feature Elimination (SVM-RFE). Feature importance is measured, and the most significant features, known as the k-top features, are selected recursively. The technique was tested on a food reviews dataset from Kaggle to classify reviews as positive or negative. An SVM was then used as the classifier to evaluate classification performance, which was measured by accuracy, precision, recall and F-measure. The highest scores obtained were 80% accuracy, 82% precision, 76% recall and 79% F-measure. These results were obtained after reducing the number of features to be classified by 24.5%. This reduction allows computational resources to be used more efficiently while classification performance is maintained. |
---|
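
The reported F-measure can be checked against the reported precision and recall. The sketch below assumes the F-measure is the balanced F1 score (the harmonic mean of precision and recall), which the summary does not state explicitly; the function name `f_measure` is illustrative and does not come from the paper.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (balanced F1, an assumption here)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Best scores reported in the summary.
reported_precision = 0.82
reported_recall = 0.76

f1 = f_measure(reported_precision, reported_recall)
print(f"F-measure: {f1:.2f}")  # prints "F-measure: 0.79"
```

Under that assumption, the computed value rounds to the 79% given in the summary, so the three reported figures are mutually consistent.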