Implementation of improved Levenshtein algorithm for spelling correction word candidate list generation

Candidates’ list generation in spelling correction is a process of finding words from a lexicon that are close to the incorrect word. The most widely used algorithm to generate the candidate list is the Levenshtein algorithm. However, the algorithm consumes high computational cost, especially when t...

Full description

Saved in:
Bibliographic Details
Main Authors: Abdulkhudur, Hanan Najm, Habeeb, Imad Q., Yusof, Yuhanis, Mohd Yusof, Shahrul Azmi
Format: Article
Language:English
Published: 2016
Subjects:
Online Access:http://repo.uum.edu.my/20609/1/JTAIT%2088%203%202016%20449%20455.pdf
http://repo.uum.edu.my/20609/
http://www.jatit.org/volumes/Vol88No3/10Vol88No3.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
id my.uum.repo.20609
record_format eprints
spelling my.uum.repo.206092017-01-16T08:22:38Z http://repo.uum.edu.my/20609/ Implementation of improved Levenshtein algorithm for spelling correction word candidate list generation Abdulkhudur, Hanan Najm Habeeb, Imad Q. Yusof, Yuhanis Mohd Yusof, Shahrul Azmi QA75 Electronic computers. Computer science Candidates’ list generation in spelling correction is a process of finding words from a lexicon that are close to the incorrect word. The most widely used algorithm to generate the candidate list is the Levenshtein algorithm. However, the algorithm consumes high computational cost, especially when there is a large number of spelling errors. The reason is that calculating Levenshtein algorithm includes operations that create an array and fill the cells of this array by comparing the characters of an incorrect word with the characters of a word from a lexicon. Since most lexicons contain millions of words, such operations will be repeated millions of times for each incorrect word in order to generate its candidates’ list. This study proposes an improved Levenshtein algorithm that reduces the operation steps in comparing characters between the query and lexicon words. Experimental results show that the proposed algorithm outperformed the Levenshtein algorithm in terms of processing time by having 32.43% percentage decrease. 2016-06-30 Article PeerReviewed application/pdf en http://repo.uum.edu.my/20609/1/JTAIT%2088%203%202016%20449%20455.pdf Abdulkhudur, Hanan Najm and Habeeb, Imad Q. and Yusof, Yuhanis and Mohd Yusof, Shahrul Azmi (2016) Implementation of improved Levenshtein algorithm for spelling correction word candidate list generation. Journal of Theoretical and Applied Information Technology, 88 (3). pp. 449-455. ISSN 1992-8645 http://www.jatit.org/volumes/Vol88No3/10Vol88No3.pdf
institution Universiti Utara Malaysia
building UUM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Utara Malaysia
content_source UUM Institutionali Repository
url_provider http://repo.uum.edu.my/
language English
topic QA75 Electronic computers. Computer science
spellingShingle QA75 Electronic computers. Computer science
Abdulkhudur, Hanan Najm
Habeeb, Imad Q.
Yusof, Yuhanis
Mohd Yusof, Shahrul Azmi
Implementation of improved Levenshtein algorithm for spelling correction word candidate list generation
description Candidates’ list generation in spelling correction is a process of finding words from a lexicon that are close to the incorrect word. The most widely used algorithm to generate the candidate list is the Levenshtein algorithm. However, the algorithm consumes high computational cost, especially when there is a large number of spelling errors. The reason is that calculating Levenshtein algorithm includes operations that create an array and fill the cells of this array by comparing the characters of an incorrect word with the characters of a word from a lexicon. Since most lexicons contain millions of words, such operations will be repeated millions of times for each incorrect word in order to generate its candidates’ list. This study proposes an improved Levenshtein algorithm that reduces the operation steps in comparing characters between the query and lexicon words. Experimental results show that the proposed algorithm outperformed the Levenshtein algorithm in terms of processing time by having 32.43% percentage decrease.
format Article
author Abdulkhudur, Hanan Najm
Habeeb, Imad Q.
Yusof, Yuhanis
Mohd Yusof, Shahrul Azmi
author_facet Abdulkhudur, Hanan Najm
Habeeb, Imad Q.
Yusof, Yuhanis
Mohd Yusof, Shahrul Azmi
author_sort Abdulkhudur, Hanan Najm
title Implementation of improved Levenshtein algorithm for spelling correction word candidate list generation
title_short Implementation of improved Levenshtein algorithm for spelling correction word candidate list generation
title_full Implementation of improved Levenshtein algorithm for spelling correction word candidate list generation
title_fullStr Implementation of improved Levenshtein algorithm for spelling correction word candidate list generation
title_full_unstemmed Implementation of improved Levenshtein algorithm for spelling correction word candidate list generation
title_sort implementation of improved levenshtein algorithm for spelling correction word candidate list generation
publishDate 2016
url http://repo.uum.edu.my/20609/1/JTAIT%2088%203%202016%20449%20455.pdf
http://repo.uum.edu.my/20609/
http://www.jatit.org/volumes/Vol88No3/10Vol88No3.pdf
_version_ 1644283004960374784
score 13.211869