To study the performance of stemming algorithm on Malay words beginning with the letter "S" / Rohana Jantan
This thesis studies the performance of a Malay stemming algorithm on words beginning with the letter "S". The test document is a Malay translation of the Quran. A Malay stemming algorithm known as Rules-Application-Order (RAO) is applied in the exp...
Main Author: | Jantan, Rohana |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2000 |
Subjects: | Malaysia |
Online Access: | https://ir.uitm.edu.my/id/eprint/98222/1/98222.PDF https://ir.uitm.edu.my/id/eprint/98222/ |
id |
my.uitm.ir.98222 |
---|---|
record_format |
eprints |
spelling |
my.uitm.ir.98222 2024-08-05T03:41:48Z https://ir.uitm.edu.my/id/eprint/98222/ Jantan, Rohana Malaysia 2000 Thesis NonPeerReviewed text en https://ir.uitm.edu.my/id/eprint/98222/1/98222.PDF To study the performance of stemming algorithm on Malay words beginning with the letter "S" / Rohana Jantan. (2000) Degree thesis, Universiti Teknologi MARA (UiTM). |
institution |
Universiti Teknologi Mara |
building |
Tun Abdul Razak Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Teknologi Mara |
content_source |
UiTM Institutional Repository |
url_provider |
http://ir.uitm.edu.my/ |
language |
English |
topic |
Malaysia |
description |
This thesis studies the performance of a Malay stemming algorithm on words beginning with the letter "S". The test document is a Malay translation of the Quran. A Malay stemming algorithm known as Rules-Application-Order (RAO) is applied in the experiments, together with dictionaries of Malay root words and combinations of morphological rules. The algorithm is evaluated by applying it to the "S" words and removing different combinations of prefixes. Each "S" word, or the stem that results from it, is checked against the dictionaries; if it is found there, stemming stops for that word. The stemmed words are then analyzed, and the percentage for each combination is compared to find the best combination of prefixes. The results show that overstemming, understemming and unstemmed words remain a problem. Of a total of 411 unique "S" words, 0.73% are overstemmed, 0.73% are understemmed and 2.68% are left unstemmed. The algorithm must therefore be modified to improve stemming performance for Malay words. |
format |
Thesis |
author |
Jantan, Rohana |
title |
To study the performance of stemming algorithm on Malay words beginning with the letter "S" / Rohana Jantan |
publishDate |
2000 |
url |
https://ir.uitm.edu.my/id/eprint/98222/1/98222.PDF https://ir.uitm.edu.my/id/eprint/98222/ |
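
The procedure summarized in the description (strip a prefix, check the result against a root-word dictionary, stop as soon as a match is found) can be illustrated with a minimal Python sketch. This is not the RAO algorithm from the thesis: the prefix list, the tiny dictionary, and the names `ROOT_WORDS`, `PREFIXES`, `stem` and `percentage` are all assumptions made for illustration. The `percentage` helper only shows how figures such as those reported could be computed. The reported 0.73% and 2.68% are consistent with 3 and 11 words out of 411, although the record itself does not state the counts.

```python
# Hypothetical root-word dictionary; the thesis's dictionaries are not reproduced here.
ROOT_WORDS = {"sama", "besar", "sayang"}

# Illustrative prefixes, tried in this order; this is NOT the RAO rule set.
PREFIXES = ("se", "ber", "di", "ter")


def stem(word, roots=ROOT_WORDS, prefixes=PREFIXES):
    """Strip prefixes one at a time until the word is found in the dictionary.

    Returns (stem, "found") on a dictionary match, or the original word
    with "unstemmed" when no prefix applies and no match was found.
    """
    current = word.lower()
    while current not in roots:
        for prefix in prefixes:
            rest = current[len(prefix):]
            # Keep at least two characters so a word is never stripped to nothing.
            if current.startswith(prefix) and len(rest) >= 2:
                current = rest
                break
        else:
            # No prefix matched: give up and report the word as unstemmed.
            return word.lower(), "unstemmed"
    return current, "found"


def percentage(count, total):
    """Share of `count` in `total`, as a percentage rounded to two decimals."""
    return round(100 * count / total, 2)


print(stem("sebesar"))
print(stem("sekolah"))
print(percentage(3, 411), percentage(11, 411))
```

With this toy dictionary, "sebesar" is reduced to "besar" after one prefix is removed, while "sekolah" ends up reported as unstemmed because stripping "se" leaves "kolah", which is not a dictionary entry and carries no further prefix. A real evaluation would also need a reference list of correct stems to separate overstemming from understemming, which this sketch does not attempt.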