AN INTEGRATED GENERIC TEXT CLASSIFICATION ALGORITHM FOR INDONESIAN AND MALAY NEWS DOCUMENTS

Text classification (TC) provides a better way to organize information since it allows better understanding and interpretation of the content. It deals with the assignment of labels to groups of similar textual documents. However, TC research for Asian language documents is relatively limited com...

Bibliographic Details
Main Author: ZUL INDRA
Format: Thesis
Language: English
Published: 2016
Subjects: QA75 Electronic computers. Computer science
Online Access: http://utpedia.utp.edu.my/id/eprint/21420/1/2015%20-IT%20-%20AN%20INTEGRATED%20GENERIC%20TEXT%20CLASSIFICATION%20ALGORITHM%20FOR%20INDONESIAN%20AND%20MALAY%20NEWS%20DOCUMENT%20-%20ZUL%20INDRA%20-%20MASTER.pdf
http://utpedia.utp.edu.my/id/eprint/21420/
id oai:utpedia.utp.edu.my:21420
record_format eprints
spelling oai:utpedia.utp.edu.my:21420 2024-07-24T07:16:27Z http://utpedia.utp.edu.my/id/eprint/21420/ ZUL INDRA QA75 Electronic computers. Computer science 2016-07 Thesis NonPeerReviewed application/pdf en ZUL INDRA (2016) AN INTEGRATED GENERIC TEXT CLASSIFICATION ALGORITHM FOR INDONESIAN AND MALAY NEWS DOCUMENTS. Masters thesis, Universiti Teknologi PETRONAS.
institution Universiti Teknologi Petronas
building UTP Resource Centre
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Petronas
content_source UTP Electronic and Digitized Intellectual Asset
url_provider http://utpedia.utp.edu.my/
language English
topic QA75 Electronic computers. Computer science
description Text classification (TC) provides a better way to organize information since it allows better understanding and interpretation of the content. It deals with the assignment of labels to groups of similar textual documents. However, TC research on Asian-language documents is relatively limited compared to English documents, and even more so for news articles. Apart from that, TC research on classifying textual documents in morphologically similar languages such as Indonesian and Malay is still scarce. Hence, the aim of this study is to develop an integrated generic TC algorithm that is able to identify the language and then classify the category of the identified news documents. Furthermore, a top-ra feature selection method is utilised to improve TC performance and to overcome the challenges of classifying online news corpora: the rapid data growth of online news documents and the high computational time.
format Thesis
author ZUL INDRA
title AN INTEGRATED GENERIC TEXT CLASSIFICATION ALGORITHM FOR INDONESIAN AND MALAY NEWS DOCUMENTS
publishDate 2016