Qur'anic words stemming

Bibliographic Details
Main Authors: Raja Yusof, Raja Jamilah; Zainuddin, R.; Baba, Mohd Sapiyan; Yusoff, Z.M.
Format: Article
Published: Springer 2010
Online Access: http://eprints.um.edu.my/5673/
https://pdfs.semanticscholar.org/fec6/3fee7f1ab03ed674cdf6ea2cd3560d46ef17.pdf?_ga=2.68002260.498228766.1581304948-1568696367.1580870149
Description
Summary: Arabic words are known to have a complex morphological structure. The different structures produce various word patterns, or derivatives, from a root word. This paper attempts to identify the word patterns that originate from a root word. These word patterns are compared to the words in the 30th part of the Qur'an. Nine stemming test cases were outlined for words in the 30th part of the Qur'an. Analysis showed that stemming nouns and particles leads to a lower percentage error than stemming the 10 letters that can be added as affixes to a root word. A rule-based stemming engine (RSE) was also implemented; the stemming accuracy achieved was 62.5%, and the average time taken to stem 1000 word tokens was 11.7 ms. The accuracy of the results was comparable to that of other stemming engines such as the Khoja stemmer, the Buckwalter Morphological Analyzer (BAMA), the Tri-literal Root Extraction (TRE) algorithm, and the Voting algorithm.
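
To illustrate the general idea of rule-based affix stripping described in the summary, the following is a minimal Python sketch. It is not the RSE from the paper: the prefix and suffix lists, the function names `light_stem` and `candidate_affix_letters`, and the minimum stem length of three letters are all assumptions chosen for illustration, and the input is assumed to be undiacritized text. The set of ten letters uses the traditional mnemonic سألتمونيها for the letters that can be added to a root.

```python
# Illustrative sketch only; the affix lists below are hypothetical and are
# not the rule set used by the paper's rule-based stemming engine (RSE).

# The ten letters that can be added to a root, collected in the
# traditional mnemonic word.
AUGMENT_LETTERS = set("سألتمونيها")

# Hypothetical, deliberately small affix lists.
PREFIXES = ["ال", "و", "ف", "ب", "ل"]
SUFFIXES = ["ون", "ين", "ات", "ها", "ة"]


def light_stem(word, min_len=3):
    """Strip at most one listed prefix and one listed suffix.

    An affix is removed only if at least `min_len` letters remain,
    so short words such as particles are left unchanged.
    """
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
            word = word[:-len(suffix)]
            break
    return word


def candidate_affix_letters(word):
    """Return the letters of `word` that belong to the ten augment letters."""
    return [ch for ch in word if ch in AUGMENT_LETTERS]


if __name__ == "__main__":
    for token in ["الكتاب", "المسلمون", "في"]:
        print(token, light_stem(token), candidate_affix_letters(light_stem(token)))
```

The second helper hints at why stripping the ten letters is error-prone: a letter such as ت can be either an affix or part of the root (as in كتاب), so membership in the set alone does not decide whether it should be removed.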