Description: Segmentation Word to Improve Performance Sentiment Analysis for Indonesian Language

Bibliographic Details
Main Authors:	Siti Mujilahwati, Siti Mujilahwati, M. Safar, Noor Zuraidin, Supriyanto, Catur
Format:	Article
Language:	English
Published:	ASPG 2024
Subjects:	P302 - 302.87 Discourse analysis
Online Access:	http://eprints.uthm.edu.my/12117/1/J17676_7aa8913f69426b15d1c915a823c6ac83.pdf http://eprints.uthm.edu.my/12117/ https://doi.org/10.54216/FPA.150213

Description
Abstract:	This study explores improving the accuracy of Indonesian sentiment analysis by adding a text segmentation step to the pre-processing phase. One of the most important steps in building a high-quality Bag of Words is separating Indonesian sentences written without spaces, which the proposed text segmentation algorithm makes possible. Observation and analysis showed that social media comments frequently contain words run together without spacing. The segmentation process was developed as a matching model that uses a standard Indonesian word dictionary. The implementation was tested on Indonesian text data related to COVID-19 management and yielded a substantial increase of 3,036 features. The Bag of Words was then constructed using the Term Frequency-Inverse Document Frequency (TF-IDF) method. Sentiment classification was subsequently tested with both deep learning and machine learning models to assess data quality and accuracy. The accuracy obtained with Deep Learning, Support Vector Machine, and Naive Bayes was 86.46%, 88.02%, and 86.19%, respectively.
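The abstract does not spell out the matching model, so the following is only a minimal sketch of one common way to do dictionary-based segmentation (greedy longest-prefix matching), followed by a plain TF-IDF weighting. The tiny `DICTIONARY`, the function names, and the specific TF-IDF variant (`tf * log(N / df)`) are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical mini-dictionary for illustration only; a real run would load
# a full standard Indonesian word list.
DICTIONARY = {"saya", "tidak", "suka", "vaksin", "ini", "bagus", "sekali"}
MAX_WORD_LEN = max(len(w) for w in DICTIONARY)


def segment_token(token, dictionary=DICTIONARY, max_len=MAX_WORD_LEN):
    """Split an unspaced token by greedy longest-prefix dictionary matching.

    If the token cannot be fully covered by dictionary words, it is
    returned unchanged as a one-element list.
    """
    if token in dictionary:
        return [token]
    words, i = [], 0
    while i < len(token):
        # Try the longest candidate first, then shrink it.
        for j in range(min(len(token), i + max_len), i, -1):
            if token[i:j] in dictionary:
                words.append(token[i:j])
                i = j
                break
        else:
            return [token]  # no dictionary word starts at position i
    return words


def segment_text(text):
    """Lowercase, split on whitespace, and segment each token."""
    tokens = []
    for token in text.lower().split():
        tokens.extend(segment_token(token))
    return tokens


def tfidf(docs):
    """Compute tf * log(N / df) for each term of each tokenized document."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        term_freq = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return weights


if __name__ == "__main__":
    comments = ["sayatidaksuka vaksin ini", "vaksin ini bagussekali"]
    segmented = [segment_text(c) for c in comments]
    print(segmented)
    print(tfidf(segmented))
```

Without segmentation, `sayatidaksuka` would enter the Bag of Words as a single unseen feature; after segmentation it contributes three dictionary words. Greedy matching can fail on tokens where only a shorter first match leads to a full split, so a backtracking or dynamic-programming variant may be preferable in practice.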


Similar Items