Implementation of improved Levenshtein algorithm for spelling correction word candidate list generation
Candidates’ list generation in spelling correction is a process of finding words from a lexicon that are close to the incorrect word. The most widely used algorithm to generate the candidate list is the Levenshtein algorithm. However, the algorithm consumes high computational cost, especially when t...
Saved in:
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Published: |
2016
|
Subjects: | |
Online Access: | http://repo.uum.edu.my/20609/1/JTAIT%2088%203%202016%20449%20455.pdf http://repo.uum.edu.my/20609/ http://www.jatit.org/volumes/Vol88No3/10Vol88No3.pdf |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
id |
my.uum.repo.20609 |
---|---|
record_format |
eprints |
spelling |
my.uum.repo.206092017-01-16T08:22:38Z http://repo.uum.edu.my/20609/ Implementation of improved Levenshtein algorithm for spelling correction word candidate list generation Abdulkhudur, Hanan Najm Habeeb, Imad Q. Yusof, Yuhanis Mohd Yusof, Shahrul Azmi QA75 Electronic computers. Computer science Candidates’ list generation in spelling correction is a process of finding words from a lexicon that are close to the incorrect word. The most widely used algorithm to generate the candidate list is the Levenshtein algorithm. However, the algorithm consumes high computational cost, especially when there is a large number of spelling errors. The reason is that calculating Levenshtein algorithm includes operations that create an array and fill the cells of this array by comparing the characters of an incorrect word with the characters of a word from a lexicon. Since most lexicons contain millions of words, such operations will be repeated millions of times for each incorrect word in order to generate its candidates’ list. This study proposes an improved Levenshtein algorithm that reduces the operation steps in comparing characters between the query and lexicon words. Experimental results show that the proposed algorithm outperformed the Levenshtein algorithm in terms of processing time by having 32.43% percentage decrease. 2016-06-30 Article PeerReviewed application/pdf en http://repo.uum.edu.my/20609/1/JTAIT%2088%203%202016%20449%20455.pdf Abdulkhudur, Hanan Najm and Habeeb, Imad Q. and Yusof, Yuhanis and Mohd Yusof, Shahrul Azmi (2016) Implementation of improved Levenshtein algorithm for spelling correction word candidate list generation. Journal of Theoretical and Applied Information Technology, 88 (3). pp. 449-455. ISSN 1992-8645 http://www.jatit.org/volumes/Vol88No3/10Vol88No3.pdf |
institution |
Universiti Utara Malaysia |
building |
UUM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Utara Malaysia |
content_source |
UUM Institutionali Repository |
url_provider |
http://repo.uum.edu.my/ |
language |
English |
topic |
QA75 Electronic computers. Computer science |
spellingShingle |
QA75 Electronic computers. Computer science Abdulkhudur, Hanan Najm Habeeb, Imad Q. Yusof, Yuhanis Mohd Yusof, Shahrul Azmi Implementation of improved Levenshtein algorithm for spelling correction word candidate list generation |
description |
Candidates’ list generation in spelling correction is a process of finding words from a lexicon that are close to the incorrect word. The most widely used algorithm to generate the candidate list is the Levenshtein algorithm. However, the algorithm consumes high computational cost, especially when there is a large number of spelling errors. The reason is that calculating Levenshtein algorithm includes operations that create an array and fill the cells of this array by comparing the characters of an incorrect word with the characters of a word from a lexicon. Since most lexicons contain millions of words, such operations will be repeated millions of times for each incorrect word in order to generate its candidates’ list. This study proposes an improved Levenshtein algorithm that reduces the operation steps in comparing characters between the query and lexicon words. Experimental results show that the proposed algorithm outperformed the Levenshtein algorithm in terms of processing time by having 32.43% percentage decrease. |
format |
Article |
author |
Abdulkhudur, Hanan Najm Habeeb, Imad Q. Yusof, Yuhanis Mohd Yusof, Shahrul Azmi |
author_facet |
Abdulkhudur, Hanan Najm Habeeb, Imad Q. Yusof, Yuhanis Mohd Yusof, Shahrul Azmi |
author_sort |
Abdulkhudur, Hanan Najm |
title |
Implementation of improved Levenshtein algorithm for spelling correction word candidate list generation |
title_short |
Implementation of improved Levenshtein algorithm for spelling correction word candidate list generation |
title_full |
Implementation of improved Levenshtein algorithm for spelling correction word candidate list generation |
title_fullStr |
Implementation of improved Levenshtein algorithm for spelling correction word candidate list generation |
title_full_unstemmed |
Implementation of improved Levenshtein algorithm for spelling correction word candidate list generation |
title_sort |
implementation of improved levenshtein algorithm for spelling correction word candidate list generation |
publishDate |
2016 |
url |
http://repo.uum.edu.my/20609/1/JTAIT%2088%203%202016%20449%20455.pdf http://repo.uum.edu.my/20609/ http://www.jatit.org/volumes/Vol88No3/10Vol88No3.pdf |
_version_ |
1644283004960374784 |
score |
13.211869 |