Sentiment analysis of noisy Malay text: State of art, challenges and future work

Sentiment analysis (SA) is the study of automatically extracting people's opinions and emotions, in the form of sentiments, from natural language text. It is very useful in social media monitoring because it allows users to gain an overall picture of the extensive public opinion behind ma...

Bibliographic Details
Main Authors: Muhammad Fakhrur Razi Abu Bakar, Norisma Idris, Liyana Shuib, Norazlina Khamis
Format: Article
Language: English
Published: 2020
Online Access: https://eprints.ums.edu.my/id/eprint/25514/1/Sentiment%20analysis%20of%20noisy%20Malay%20text%20State%20of%20art%2C%20challenges%20and%20future%20work.pdf
https://eprints.ums.edu.my/id/eprint/25514/
https://doi.org/10.1109/ACCESS.2020.2968955
id my.ums.eprints.25514
record_format eprints
Muhammad Fakhrur Razi Abu Bakar, Norisma Idris, Liyana Shuib and Norazlina Khamis (2020) Sentiment analysis of noisy Malay text: State of art, challenges and future work. IEEE Access, 8, pp. 24687-24696. ISSN 2169-3536. Peer-reviewed article. https://doi.org/10.1109/ACCESS.2020.2968955
institution Universiti Malaysia Sabah
building UMS Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaysia Sabah
content_source UMS Institutional Repository
url_provider http://eprints.ums.edu.my/
language English
description Sentiment analysis (SA) is the study of automatically extracting people's opinions and emotions, in the form of sentiments, from natural language text. It is very useful in social media monitoring because it allows users to gain an overall picture of the extensive public opinion behind many topics. Most work on SA targets English text; only a few works focus on the Malay language. Existing reviews of SA for the Malay language focus only on the SA approaches and the datasets. Major issues such as the pre-processing techniques used to normalize noisy text, the most commonly employed performance measures for Malay SA, and the challenges for Malay SA have not been reviewed. Malaysians tend not to fully follow any abbreviation rules when writing on social media. Thus, a lot of noisy text can be found on social media sites like Facebook and Twitter, which creates issues for the SA process. Hence, the aim of this study is to investigate the state of the art, challenges and future work of SA for Malay social media text. This study provides a review of the various approaches, datasets, performance measures, and pre-processing techniques used in previous works on SA of Malay text. More than 700 articles from journals and conference proceedings were identified using the search keywords; however, only 17 relevant articles published from 2013 to 2018 were reviewed. The findings from this review focus on three commonly used SA approaches: lexicon-based, machine learning, and hybrid.
format Article
author Muhammad Fakhrur Razi Abu Bakar
Norisma Idris
Liyana Shuib
Norazlina Khamis
title Sentiment analysis of noisy Malay text: State of art, challenges and future work
publishDate 2020