Staff View: CLASENTI: A class-specific sentiment analysis framework

CLASENTI: A class-specific sentiment analysis framework

Bibliographic Details
Main Authors:	Hamdi, Ali; Shaban, Khaled Bashir; Zainal, Anazida
Format:	Article
Published:	Association for Computing Machinery 2018
Subjects:	QA75 Electronic computers. Computer science
Online Access:	http://eprints.utm.my/id/eprint/84314/ https://doi.org/10.1145/3209885

id	my.utm.84314
record_format	eprints
spelling	my.utm.843142019-12-28T01:46:39Z http://eprints.utm.my/id/eprint/84314/ CLASENTI: A class-specific sentiment analysis framework Hamdi, Ali Shaban, Khaled Bashir Zainal, Anazida QA75 Electronic computers. Computer science Arabic text sentiment analysis suffers from low accuracy due to Arabic-specific challenges (e.g., limited resources, morphological complexity, and dialects) and general linguistic issues (e.g., fuzziness, implicit sentiment, sarcasm, and spam). The limited resources problem requires efforts to build new and improved Arabic corpora and lexica. We propose a class-specific sentiment analysis (CLASENTI) framework. The framework includes a new annotation approach to build multi-faceted Arabic corpus and lexicon allowing for simultaneous annotation of different facets, including domains, dialects, linguistic issues, and polarity strengths. Each of these facets has multiple classes (e.g., the nine classes representing dialects found in the Arab world). The new corpus and lexicon annotations facilitate the development of new class-specific classification models and polarity strength calculation. For the new sentiment classification models, we propose a hybrid model combining corpus-based and lexicon-based models. The corpus-based model has two interrelated phases to build; (1) full-corpus classification models for all facets; and (2) class-specific models trained on filtered subsets of the corpus according to the performances of the full-corpus models. To calculate polarity strengths, the lexicon-based model filters the annotated lexicon based on the specific classes of the domain and dialect. As a case study, we collect and annotate 15274 reviews from various sources, including surveys, Facebook comments, and Twitter posts, pertaining to governmental services. In addition, we develop a new web-based application to apply the proposed framework on the case study. 
CLASENTI framework reaches up to 95% accuracy and 93% F1-Score surpassing the best-known sentiment classifiers implemented in Scikit-learn library that achieve 82% accuracy and 81% F1-Score for Arabic when tested on the same dataset. Association for Computing Machinery 2018 Article PeerReviewed Hamdi, Ali and Shaban, Khaled Bashir and Zainal, Anazida (2018) CLASENTI: A class-specific sentiment analysis framework. ACM Transactions on Asian and Low-Resource Language Information Processing, 17 (4). p. 32. ISSN 2375-4699 https://doi.org/10.1145/3209885
institution	Universiti Teknologi Malaysia
building	UTM Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Teknologi Malaysia
content_source	UTM Institutional Repository
url_provider	http://eprints.utm.my/
topic	QA75 Electronic computers. Computer science
spellingShingle	QA75 Electronic computers. Computer science Hamdi, Ali Shaban, Khaled Bashir Zainal, Anazida CLASENTI: A class-specific sentiment analysis framework
description	Arabic text sentiment analysis suffers from low accuracy due to Arabic-specific challenges (e.g., limited resources, morphological complexity, and dialects) and general linguistic issues (e.g., fuzziness, implicit sentiment, sarcasm, and spam). The limited-resources problem requires efforts to build new and improved Arabic corpora and lexica. We propose a class-specific sentiment analysis (CLASENTI) framework. The framework includes a new annotation approach for building a multi-faceted Arabic corpus and lexicon, allowing simultaneous annotation of different facets, including domains, dialects, linguistic issues, and polarity strengths. Each of these facets has multiple classes (e.g., the nine classes representing dialects found in the Arab world). The new corpus and lexicon annotations facilitate the development of new class-specific classification models and polarity strength calculation. For the new sentiment classification models, we propose a hybrid model combining corpus-based and lexicon-based models. The corpus-based model is built in two interrelated phases: (1) full-corpus classification models for all facets; and (2) class-specific models trained on subsets of the corpus filtered according to the performance of the full-corpus models. To calculate polarity strengths, the lexicon-based model filters the annotated lexicon based on the specific classes of the domain and dialect. As a case study, we collect and annotate 15274 reviews from various sources, including surveys, Facebook comments, and Twitter posts, pertaining to governmental services. In addition, we develop a new web-based application to apply the proposed framework to the case study. The CLASENTI framework reaches up to 95% accuracy and 93% F1-score, surpassing the best-known sentiment classifiers implemented in the scikit-learn library, which achieve 82% accuracy and 81% F1-score for Arabic when tested on the same dataset.
format	Article
author	Hamdi, Ali; Shaban, Khaled Bashir; Zainal, Anazida
author_facet	Hamdi, Ali; Shaban, Khaled Bashir; Zainal, Anazida
author_sort	Hamdi, Ali
title	CLASENTI: A class-specific sentiment analysis framework
title_short	CLASENTI: A class-specific sentiment analysis framework
title_full	CLASENTI: A class-specific sentiment analysis framework
title_fullStr	CLASENTI: A class-specific sentiment analysis framework
title_full_unstemmed	CLASENTI: A class-specific sentiment analysis framework
title_sort	clasenti: a class-specific sentiment analysis framework
publisher	Association for Computing Machinery
publishDate	2018
url	http://eprints.utm.my/id/eprint/84314/ https://doi.org/10.1145/3209885
_version_	1654960070098681856
score	13.211869
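
Illustrative note on the description field: the abstract states that the lexicon-based model filters the annotated lexicon by the specific domain and dialect classes before calculating polarity strengths. The record does not give the scoring formula, so the following is only a minimal sketch under stated assumptions. The entry layout (`LexiconEntry`), the class labels (`health`, `traffic`, `gulf`), the strength scale, and the averaging rule are all hypothetical and are not taken from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexiconEntry:
    """One annotated lexicon term (hypothetical layout, not the authors' schema)."""
    term: str
    domain: str      # domain class, e.g. "health" (assumed label)
    dialect: str     # dialect class, e.g. "gulf" (assumed label)
    strength: float  # signed polarity strength on an assumed [-1, 1] scale


def filter_lexicon(lexicon, domain, dialect):
    """Keep only the entries annotated with the given domain and dialect classes."""
    return {
        entry.term: entry.strength
        for entry in lexicon
        if entry.domain == domain and entry.dialect == dialect
    }


def polarity_strength(tokens, lexicon, domain, dialect):
    """Average the strengths of tokens found in the class-specific lexicon.

    Averaging is an assumption made for this sketch; returns 0.0 when no
    token matches the filtered lexicon.
    """
    scores = filter_lexicon(lexicon, domain, dialect)
    hits = [scores[token] for token in tokens if token in scores]
    return sum(hits) / len(hits) if hits else 0.0


# Toy data: the same term carries a different strength in each domain class.
TOY_LEXICON = [
    LexiconEntry("fast", "health", "gulf", 0.5),
    LexiconEntry("slow", "health", "gulf", -0.5),
    LexiconEntry("fast", "traffic", "gulf", -1.0),
]

review = ["service", "was", "fast"]
health_score = polarity_strength(review, TOY_LEXICON, "health", "gulf")
traffic_score = polarity_strength(review, TOY_LEXICON, "traffic", "gulf")
```

In this toy setup the same review scores differently depending on which domain class selects the lexicon subset, which is the point of filtering before scoring.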

Similar Items