To improve stemming algorithm on Malay words begin with alphabet B / Norasiah Ismail
This thesis concerns a Malay-language document retrieval system. The study uses a stemming algorithm, Malay translations of the Quran, and root dictionaries. The performance of the Malay stemming algorithm on words beginning with the letter 'b' is tested using 5 experiment...
Main Author: | Ismail, Norasiah |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2001 |
Subjects: | Analysis |
Online Access: | https://ir.uitm.edu.my/id/eprint/98076/1/98076.pdf https://ir.uitm.edu.my/id/eprint/98076/ |
id |
my.uitm.ir.98076 |
---|---|
record_format |
eprints |
spelling |
my.uitm.ir.98076 2024-08-21T23:27:35Z https://ir.uitm.edu.my/id/eprint/98076/ To improve stemming algorithm on Malay words begin with alphabet B / Norasiah Ismail. Ismail, Norasiah. Analysis. 2001 Thesis NonPeerReviewed text en https://ir.uitm.edu.my/id/eprint/98076/1/98076.pdf To improve stemming algorithm on Malay words begin with alphabet B / Norasiah Ismail. (2001) Degree thesis, thesis, Universiti Teknologi MARA (UiTM). |
institution |
Universiti Teknologi Mara |
building |
Tun Abdul Razak Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Teknologi Mara |
content_source |
UiTM Institutional Repository |
url_provider |
http://ir.uitm.edu.my/ |
language |
English |
topic |
Analysis |
description |
This thesis concerns a Malay-language document retrieval system. The study uses a stemming algorithm, Malay translations of the Quran, and root dictionaries. The performance of the Malay stemming algorithm on words beginning with the letter 'b' is tested in five experiments. The first experiment uses the original set of data collections. In the second experiment, affix rules are added, in rule format, to the file "rule.txt". The third experiment modifies the total value for the V dictionary in the header file "dcvarnew.h". In the fourth experiment, a new word is added to the dictionary and the Malay Quran translation is modified. In the fifth experiment, the total value for the 'a' dictionary in the header file "dcvarnew.h" is modified. The main objective of these experiments is to minimize unstemmed, understemmed, and overstemmed words, spelling exceptions, and other problems that occur when 'b' words are stemmed. The objective is achieved once the best order in which to apply the rules to words beginning with 'b' is found. This involves using two rule-order combinations together: prefix-suffix-prefix suffix-infix as the primary combination and prefix suffix-suffix-prefix-infix as the secondary. All words are first stemmed with the primary combination; if the program finds that a word cannot be stemmed correctly, it switches to the secondary combination. These experiments can serve as a benchmark for future research on the Malay language in finding the best approach to stemming words that begin with the other letters of the alphabet. |
format |
Thesis |
author |
Ismail, Norasiah |
title |
To improve stemming algorithm on Malay words begin with alphabet B / Norasiah Ismail |
publishDate |
2001 |
url |
https://ir.uitm.edu.my/id/eprint/98076/1/98076.pdf https://ir.uitm.edu.my/id/eprint/98076/ |
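The primary and secondary rule orders named in the description can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the affix lists, the root dictionary, and all function names are hypothetical toy data, and reading each combination as an ordered list of four rule classes (prefix, suffix, prefix plus suffix, infix) is an assumption. The description also does not say how the program detects that a word was not stemmed correctly, so the sketch assumes a caller-supplied check, `is_acceptable`.

```python
# Toy data: hypothetical affixes and roots, not taken from the thesis.
PREFIXES = ("ber", "be")
SUFFIXES = ("kan", "an", "i")
INFIXES = ("el", "em", "er")
ROOTS = {"beri", "ikan", "jalan", "getar"}


def strip_prefix(word):
    """Candidates obtained by removing one known prefix."""
    return [word[len(p):] for p in PREFIXES if word.startswith(p)]


def strip_suffix(word):
    """Candidates obtained by removing one known suffix."""
    return [word[:-len(s)] for s in SUFFIXES if word.endswith(s)]


def strip_prefix_suffix(word):
    """Candidates obtained by removing a prefix and then a suffix."""
    return [c for stem in strip_prefix(word) for c in strip_suffix(stem)]


def strip_infix(word):
    """Candidates obtained by removing an infix after the first letter."""
    if len(word) < 2:
        return []
    return [word[0] + word[1 + len(i):] for i in INFIXES
            if word[1:].startswith(i)]


# Assumed reading of the two combinations as ordered rule classes.
PRIMARY = (strip_prefix, strip_suffix, strip_prefix_suffix, strip_infix)
SECONDARY = (strip_prefix_suffix, strip_suffix, strip_prefix, strip_infix)


def apply_order(word, order):
    """Return the first dictionary root produced by the rules in order."""
    for rule in order:
        for candidate in rule(word):
            if candidate in ROOTS:
                return candidate
    return None


def stem_with_fallback(word, is_acceptable=lambda word, root: True):
    """Try the primary order, then the secondary; else leave the word as is."""
    if word in ROOTS:
        return word
    for order in (PRIMARY, SECONDARY):
        root = apply_order(word, order)
        if root is not None and is_acceptable(word, root):
            return root
    return word  # unstemmed


# The same word can resolve to different roots under the two orders.
print(apply_order("berikan", PRIMARY))    # ikan (overstemmed in this toy setup)
print(apply_order("berikan", SECONDARY))  # beri
```

In this toy setup, "berikan" shows why rule order matters: stripping the prefix first reaches the unrelated dictionary word "ikan", while trying the suffix earlier reaches "beri". A check such as `lambda w, r: (w, r) != ("berikan", "ikan")` passed as `is_acceptable` makes `stem_with_fallback` reject the first result and move to the secondary order.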