Staff View: Development of multilingual social media data corpus for sentiment classification

The purpose of this study is to develop a corpus covering two languages: Bahasa Indonesia and Bahasa Melayu. The two languages share a number of words that look alike but have different meanings. The data for the corpus were taken from social media, namely Twitter and Facebook. Each l...

Bibliographic Details
Main Authors:	Rumaisa, Fitrah; Basiron, Halizah; Saaya, Zurina
Format:	Article
Language:	English
Published:	Institute of Advanced Scientific Research, Inc. 2019
Online Access:	http://eprints.utem.edu.my/id/eprint/24479/2/ARTICLE-FITRAH-JARDCS.PDF http://eprints.utem.edu.my/id/eprint/24479/ https://www.jardcs.org/abstract.php?id=794

id	my.utem.eprints.24479
record_format	eprints
spelling	my.utem.eprints.24479 2023-07-12T11:24:39Z http://eprints.utem.edu.my/id/eprint/24479/ Development of multilingual social media data corpus for sentiment classification Rumaisa, Fitrah; Basiron, Halizah; Saaya, Zurina The purpose of this study is to develop a corpus covering two languages: Bahasa Indonesia and Bahasa Melayu. The two languages share a number of words that look alike but have different meanings. The data for the corpus were taken from social media, namely Twitter and Facebook. For each language, 2100 words were collected. Manual selection yielded 300 words that have different meanings in the two languages. These words form the core of the corpus; the remaining words are set aside. The corpus focuses on the polarity of each word in each language, assigned by automatic annotation. The result is two corpora, one for Bahasa Indonesia and one for Bahasa Melayu. The corpus will be used in subsequent research on sentence-level annotation and will be demonstrated with manual annotation by human annotators. Institute of Advanced Scientific Research, Inc. 2019 Article PeerReviewed text en http://eprints.utem.edu.my/id/eprint/24479/2/ARTICLE-FITRAH-JARDCS.PDF Rumaisa, Fitrah and Basiron, Halizah and Saaya, Zurina (2019) Development of multilingual social media data corpus for sentiment classification. Journal of Advanced Research in Dynamical and Control Systems, 11 (3). 286 - 293. ISSN 1943-023X https://www.jardcs.org/abstract.php?id=794
institution	Universiti Teknikal Malaysia Melaka
building	UTEM Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Teknikal Malaysia Melaka
content_source	UTEM Institutional Repository
url_provider	http://eprints.utem.edu.my/
language	English
description	The purpose of this study is to develop a corpus covering two languages: Bahasa Indonesia and Bahasa Melayu. The two languages share a number of words that look alike but have different meanings. The data for the corpus were taken from social media, namely Twitter and Facebook. For each language, 2100 words were collected. Manual selection yielded 300 words that have different meanings in the two languages. These words form the core of the corpus; the remaining words are set aside. The corpus focuses on the polarity of each word in each language, assigned by automatic annotation. The result is two corpora, one for Bahasa Indonesia and one for Bahasa Melayu. The corpus will be used in subsequent research on sentence-level annotation and will be demonstrated with manual annotation by human annotators.
format	Article
author	Rumaisa, Fitrah; Basiron, Halizah; Saaya, Zurina
spellingShingle	Rumaisa, Fitrah; Basiron, Halizah; Saaya, Zurina; Development of multilingual social media data corpus for sentiment classification
author_facet	Rumaisa, Fitrah; Basiron, Halizah; Saaya, Zurina
author_sort	Rumaisa, Fitrah
title	Development of multilingual social media data corpus for sentiment classification
title_short	Development of multilingual social media data corpus for sentiment classification
title_full	Development of multilingual social media data corpus for sentiment classification
title_fullStr	Development of multilingual social media data corpus for sentiment classification
title_full_unstemmed	Development of multilingual social media data corpus for sentiment classification
title_sort	development of multilingual social media data corpus for sentiment classification
publisher	Institute of Advanced Scientific Research, Inc.
publishDate	2019
url	http://eprints.utem.edu.my/id/eprint/24479/2/ARTICLE-FITRAH-JARDCS.PDF http://eprints.utem.edu.my/id/eprint/24479/ https://www.jardcs.org/abstract.php?id=794
_version_	1772816017521639424
score	13.160551
