Enhancement of sentiment analysis scoring mechanism – A case study on Malaysian Airline industry / Rayvendran Visvalingam

Sentiment polarity calculation is a method to gauge the strength of a sentiment extracted from a text. Many tools have been developed, each with its own scoring mechanism, to produce an effective sentiment score. Semantic Orientation Calculator (SO-CAL) is one of the lexicon-based tools tha...

Bibliographic Details
Main Author: Rayvendran, Visvalingam
Format: Thesis
Published: 2017
Subjects: HD Industries. Land use. Labor; QA75 Electronic computers. Computer science
Online Access: http://studentsrepo.um.edu.my/10808/2/Rayvendran.pdf
http://studentsrepo.um.edu.my/10808/1/Rayvendran.pdf
http://studentsrepo.um.edu.my/10808/
id my.um.stud.10808
record_format eprints
spelling my.um.stud.10808 2020-01-18T02:01:26Z 2017-11 Thesis NonPeerReviewed application/pdf http://studentsrepo.um.edu.my/10808/2/Rayvendran.pdf application/pdf http://studentsrepo.um.edu.my/10808/1/Rayvendran.pdf Rayvendran, Visvalingam (2017) Enhancement of sentiment analysis scoring mechanism – A case study on Malaysian Airline industry / Rayvendran Visvalingam. Masters thesis, University of Malaya. http://studentsrepo.um.edu.my/10808/
institution Universiti Malaya
building UM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaya
content_source UM Student Repository
url_provider http://studentsrepo.um.edu.my/
topic HD Industries. Land use. Labor
QA75 Electronic computers. Computer science
description Sentiment polarity calculation is a method to gauge the strength of a sentiment extracted from a text. Many tools have been developed, each with its own scoring mechanism, to produce an effective sentiment score. Semantic Orientation Calculator (SO-CAL) is a lexicon-based tool that incorporates important features (such as intensifiers and negation) to calculate the sentiment polarity of a text. However, the tool is limited in processing misspelled words, especially words with repeated letters or characters, which may lead to inaccurate sentiment scores. As a result, the accuracy of SO-CAL is low on social media text, most of which contains misspelled words. Thus, an enhanced scoring mechanism (LexiPro-SM) was developed to improve sentiment scoring by taking misspelled words into account, especially words that contain repeated letters. LexiPro-SM was tested on posts collected from the official Facebook pages of two major airlines in Malaysia, referred to as Airline A and Airline B. The development of LexiPro-SM involved three phases: data collection, data cleaning and data analysis. Data collection used the Facebook Graph API to collect three months of posts from both airlines. Data cleaning removed noise, leaving only text that contains alphabetic characters and exclamation marks. The scoring mechanism was improved and incorporated into LexiPro-SM, with features that can process misspelled words as well as improved handling of negation and exclamation marks. The cleaned data for each airline was then analyzed with both LexiPro-SM and SO-CAL.
A web-based portal was developed to visualize the LexiPro-SM results for the two airlines. Each airline has its own page with an overall score chart, a polarity group chart and a sub-services chart. The sub-services chart is a new idea implemented in this research: it breaks the overall service down into sub-services such as customer service, price, preflight and facility. This helps airline management improve their service by focusing attention on a particular sub-service. The airline pages are also linked to show comparison results between Airline A and Airline B. Based on these results, a case study of the two airlines was conducted, in which Airline A achieved a higher positive score than Airline B. Moreover, to assess the effectiveness of LexiPro-SM, the results of LexiPro-SM and SO-CAL were compared using evaluation metrics (such as accuracy, recall, precision and F1-score), with human expert results as the reference. The evaluation shows that LexiPro-SM achieved higher accuracy (90.7%) than SO-CAL (58.33%). Overall, the improvements made in LexiPro-SM increased the accuracy of sentiment detection and produced better results than SO-CAL. This indicates that processing misspelled words is an important step in social media sentiment analysis. This is further supported by the case study, which concluded that Airline A provides a better service than Airline B.
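The cleaning step and the repeated-letter handling mentioned in the abstract can be illustrated with a small Python sketch. This is not the thesis's LexiPro-SM code, which this record does not reproduce. The function names, the toy lexicon and its scores, and the rule of trying one or two letters for each repeated run are all assumptions made for illustration.

```python
import re
from itertools import groupby, product

# Hypothetical toy lexicon; the dictionary used in the thesis is not given here.
TOY_LEXICON = {"good": 3, "bad": -3, "cool": 2, "happy": 3}


def clean_text(text: str) -> str:
    """Keep only ASCII letters, exclamation marks and single spaces (lowercased)."""
    kept = re.sub(r"[^A-Za-z!\s]", " ", text)
    return re.sub(r"\s+", " ", kept).strip().lower()


def candidate_spellings(word: str) -> list[str]:
    """Shorten each run of a repeated letter to one or two letters, in every combination."""
    runs = [(ch, len(list(group))) for ch, group in groupby(word)]
    options = [[ch] if n == 1 else [ch, ch * 2] for ch, n in runs]
    return ["".join(parts) for parts in product(*options)]


def normalize_word(word: str, lexicon: dict[str, int]) -> str:
    """Return the first candidate spelling found in the lexicon, else the word unchanged."""
    if word in lexicon:
        return word
    for candidate in candidate_spellings(word):
        if candidate in lexicon:
            return candidate
    return word


if __name__ == "__main__":
    post = "Sooo goooood!!! :) #1"
    tokens = re.findall(r"[a-z]+|!", clean_text(post))
    print([normalize_word(t, TOY_LEXICON) for t in tokens])
```

A word such as "goooood" has two candidates, "god" and "good". The sketch returns whichever is found first in the lexicon, so a real system would need a rule for resolving such ties.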
format Thesis
author Rayvendran, Visvalingam
title Enhancement of sentiment analysis scoring mechanism – A case study on Malaysian Airline industry / Rayvendran Visvalingam
publishDate 2017
url http://studentsrepo.um.edu.my/10808/2/Rayvendran.pdf
http://studentsrepo.um.edu.my/10808/1/Rayvendran.pdf
http://studentsrepo.um.edu.my/10808/