Qur'anic words stemming

Arabic words are known to have complex morphological structure. The different structures produce various word patterns or derivatives from a root word. This paper attempts to identify various word patterns that originate from a root word. These word patterns are compared to the words in the 30th par...

Bibliographic Details
Main Authors: Raja Yusof, Raja Jamilah; Zainuddin, R.; Baba, Mohd Sapiyan; Yusoff, Z.M.
Format: Article
Published: Springer 2010
Subjects: T Technology (General)
Online Access: http://eprints.um.edu.my/5673/
https://pdfs.semanticscholar.org/fec6/3fee7f1ab03ed674cdf6ea2cd3560d46ef17.pdf?_ga=2.68002260.498228766.1581304948-1568696367.1580870149
id my.um.eprints.5673
record_format eprints
spelling PeerReviewed Raja Yusof, Raja Jamilah and Zainuddin, R. and Baba, Mohd Sapiyan and Yusoff, Z.M. (2010) Qur'anic words stemming. Arabian Journal for Science and Engineering, 35 (2C). pp. 37-49. ISSN 2193-567X
institution Universiti Malaya
building UM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaya
content_source UM Research Repository
url_provider http://eprints.um.edu.my/
topic T Technology (General)
description Arabic words have a complex morphological structure, and the different structures produce various word patterns, or derivatives, from a root word. This paper attempts to identify the word patterns that originate from a root word and compares them with the words in the 30th part of the Qur'an, for which nine stemming test cases were outlined. Analysis showed that stemming nouns and particles leads to a lower percentage error than stemming the 10 letters that can be added as affixes to a root word. A rule-based stemming engine (RSE) was also implemented; it achieved a stemming accuracy of 62.5%, and the average time taken to stem 1000 word tokens was 11.7 ms. This accuracy was comparable to that of other stemming engines such as the Khoja stemmer, the Buckwalter Morphological Analyzer (BAMA), the Tri-literal Root Extraction (TRE) algorithm, and the Voting algorithm.
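As a rough illustration of what rule-based affix stripping and an accuracy score can look like, the sketch below may help. It is not the paper's RSE: the function names, the short prefix and suffix lists, and the minimum stem length are all assumptions made for this example. The only linguistic fact relied on is that the ten Arabic augment letters are traditionally memorized as the word "سألتمونيها".

```python
# Hypothetical, simplified light stemmer. NOT the RSE described in the record.

# The ten augment letters, traditionally memorized as one mnemonic word.
AUGMENT_LETTERS = set("سألتمونيها")

# Small illustrative affix lists (assumptions, far from complete).
# Longer affixes come first so they are tried before their shorter overlaps.
PREFIXES = ["وال", "بال", "ال", "و", "ف", "ب", "ل"]
SUFFIXES = ["ون", "ين", "ات", "ها", "هم", "ة", "ه", "ي"]


def strip_affixes(word: str, min_len: int = 3) -> str:
    """Strip at most one prefix and one suffix, keeping at least min_len letters."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
            word = word[:-len(suffix)]
            break
    return word


def non_augment_letters(word: str) -> list:
    """Letters left if every one of the ten augment letters were removed blindly."""
    return [ch for ch in word if ch not in AUGMENT_LETTERS]


def accuracy(predicted: list, gold: list) -> float:
    """Percentage of predicted stems that exactly match the gold stems."""
    correct = sum(p == g for p, g in zip(predicted, gold))
    return 100.0 * correct / len(gold)
```

In this sketch, `strip_affixes("المسلمون")` removes the definite article and the plural ending. `non_augment_letters("مكتوب")` leaves only two letters, because the root letter ت is also one of the ten augment letters; this is one plausible way in which removing those letters blindly can lose part of the root, though the record does not say this is the cause of the higher error it reports.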
format Article
author Raja Yusof, Raja Jamilah
Zainuddin, R.
Baba, Mohd Sapiyan
Yusoff, Z.M.
title Qur'anic words stemming
publisher Springer
publishDate 2010
url http://eprints.um.edu.my/5673/
https://pdfs.semanticscholar.org/fec6/3fee7f1ab03ed674cdf6ea2cd3560d46ef17.pdf?_ga=2.68002260.498228766.1581304948-1568696367.1580870149