Building CMU Sphinx language model for the Holy Quran using simplified Arabic phonemes

Bibliographic Details
Main Authors: El Amrani, Mohamed Yassine; Rahman, M.M. Hafizur; Wahiddin, Mohamed Ridza; Shah, Asadullah
Format: Article
Language: English
Published: Elsevier 2016
Online Access: http://irep.iium.edu.my/53574/1/EIJ_Pub.pdf
http://irep.iium.edu.my/53574/7/53574_building%20CMU%20Sphinx%20language_scopus.pdf
http://irep.iium.edu.my/53574/
http://www.sciencedirect.com/science/article/pii/S1110866516300123
http://dx.doi.org/10.1016/j.eij.2016.04.002
Citation: El Amrani, Mohamed Yassine and Rahman, M.M. Hafizur and Wahiddin, Mohamed Ridza and Shah, Asadullah (2016) Building CMU Sphinx language model for the Holy Quran using simplified Arabic phonemes. Egyptian Informatics Journal, 17 (3). pp. 305-314. ISSN 1110-8665. Elsevier, 2016-11-01.
institution Universiti Islam Antarabangsa Malaysia
building IIUM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider International Islamic University Malaysia
content_source IIUM Repository (IREP)
url_provider http://irep.iium.edu.my/
topic TK7800 Electronics. Computer engineering. Computer hardware. Photoelectronic devices
description This paper investigates the use of a simplified set of Arabic phonemes in an Arabic speech recognition system applied to the Holy Quran. CMU Sphinx 4 was used to train and evaluate a language model for the Hafs narration of the Holy Quran. The language model was built using a simplified list of Arabic phonemes instead of the commonly used Romanized set, in order to simplify the process of generating the language model. With a very small set of audio files, the experiments yielded a very low Word Error Rate (WER) of 1.5% when all the audio data was used for both the training and the testing phases. However, when using 90% and 80% of the training data, the WER obtained was 50.0% and 55.7%, respectively.
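For readers unfamiliar with the metric quoted in the description, WER is conventionally computed as (substitutions + deletions + insertions) divided by the number of words in the reference transcript. The following is a minimal sketch of that standard definition using word-level edit distance. It is not the authors' evaluation code and does not use any CMU Sphinx tooling; the function name `word_error_rate` and the sample tokens are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Return (S + D + I) / N using word-level Levenshtein distance.

    Illustrative helper only; the name and interface are assumptions.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts: one substitution and one deletion over
# four reference words.
print(word_error_rate("w1 w2 w3 w4", "w1 x w3"))
```

Under this definition, a WER of 50.0% means that the number of word-level edits needed to turn the recognizer output into the reference equals half the number of reference words.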