To improve stemming algorithm on Malay words begin with alphabet B / Norasiah Ismail

This thesis concerns a Malay-language document retrieval system. The study uses a stemming algorithm, Malay translations of the Quran, and root-word dictionaries. The performance of the Malay stemming algorithm on words beginning with the letter 'b' is tested using 5 experiment...

Bibliographic Details
Main Author: Ismail, Norasiah
Format: Thesis
Language: English
Published: 2001
Subjects: Analysis
Online Access: https://ir.uitm.edu.my/id/eprint/98076/1/98076.pdf
https://ir.uitm.edu.my/id/eprint/98076/
id my.uitm.ir.98076
record_format eprints
spelling my.uitm.ir.98076 2024-08-21T23:27:35Z
2001 Thesis NonPeerReviewed text en To improve stemming algorithm on Malay words begin with alphabet B / Norasiah Ismail. (2001) Degree thesis, Universiti Teknologi MARA (UiTM).
institution Universiti Teknologi Mara
building Tun Abdul Razak Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Mara
content_source UiTM Institutional Repository
url_provider http://ir.uitm.edu.my/
language English
topic Analysis
description This thesis concerns a Malay-language document retrieval system. The study uses a stemming algorithm, Malay translations of the Quran, and root-word dictionaries. The performance of the Malay stemming algorithm on words beginning with the letter 'b' is tested in five experiments. The first experiment uses the original data collection. In the second, affix rules are added, in rule format, to the file "rule.txt". The third modifies the total value for the V dictionary in the header file "dcvarnew.h". In the fourth, a new word is added to the dictionary and the Malay Quran translation is modified. The fifth modifies the total value for the 'a' dictionary in the header file "dcvarnew.h". The main objective of these experiments is to minimize unstemmed words, understemming, overstemming, spelling exceptions, and other problems that occur when 'b' words are stemmed. The objective is achieved when the best order in which to apply the rules to words beginning with 'b' is found. This involves two rule-order combinations used together: prefix-suffix-prefix suffix-infix as the primary combination and prefix suffix-suffix-prefix-infix as the secondary. Every word is first stemmed with the primary combination; if the program finds that a word cannot be stemmed correctly, it switches to the secondary combination. These experiments can serve as a benchmark for future research on the Malay language in finding the best approach to stemming words that begin with the other letters of the alphabet.
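The primary/secondary fallback described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation (which is not shown in this record). It assumes that each combination is an ordered list of rule types, that "prefix suffix" denotes a prefix-suffix pair stripped together, that affixes are stripped cumulatively, and that a word counts as solved when it is found in the root dictionary. The affix lists, the root set, and all function names are hypothetical toy data, not the contents of "rule.txt" or the thesis's dictionaries.

```python
# Toy data: every rule and dictionary entry here is an illustrative assumption.
ROOTS = {"jalan", "baca", "peluk", "buku"}
PREFIXES = ["ber", "be"]        # tried in this order
SUFFIXES = ["kan", "an", "i"]   # tried in this order
PAIRS = [("ber", "an")]         # prefix-suffix pairs stripped together
INFIXES = ["el", "em", "er"]    # assumed to sit right after the first letter


def strip_prefix(word):
    for p in PREFIXES:
        if word.startswith(p) and len(word) > len(p):
            return word[len(p):]
    return word


def strip_suffix(word):
    for s in SUFFIXES:
        if word.endswith(s) and len(word) > len(s):
            return word[:-len(s)]
    return word


def strip_pair(word):
    for p, s in PAIRS:
        if word.startswith(p) and word.endswith(s) and len(word) > len(p) + len(s):
            return word[len(p):-len(s)]
    return word


def strip_infix(word):
    if len(word) > 3 and word[1:3] in INFIXES:
        return word[0] + word[3:]
    return word


# Assumed reading of the two combinations named in the abstract.
PRIMARY = [strip_prefix, strip_suffix, strip_pair, strip_infix]
SECONDARY = [strip_pair, strip_suffix, strip_prefix, strip_infix]


def apply_order(word, order):
    """Strip affixes cumulatively in the given order; stop at the first dictionary root."""
    current = word
    for rule in order:
        if current in ROOTS:
            return current
        current = rule(current)
    return current if current in ROOTS else None


def stem(word):
    """Try the primary order first, then fall back to the secondary order."""
    for order in (PRIMARY, SECONDARY):
        root = apply_order(word, order)
        if root is not None:
            return root
    return word  # neither order reached a dictionary root: left unstemmed


print(stem("berjalan"))    # resolved by the primary order
print(stem("berpelukan"))  # primary order fails, secondary order succeeds
```

In this toy setup, "berpelukan" shows why a second order can help: stripping the prefix and then the greedy suffix "kan" leaves a fragment that is not a root, whereas stripping the pair "ber...an" first reaches "peluk" directly.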
format Thesis
author Ismail, Norasiah
title To improve stemming algorithm on Malay words begin with alphabet B / Norasiah Ismail
publishDate 2001