Staff View: Testing Sphinx’s language model fault-tolerance for the Holy Quran

The Carnegie Mellon University’s (CMU) Sphinx framework is increasingly used for the Arabic speech recognition in general and applied to the Holy Quran in particular. Generating the language model includes a tedious task of preparing the transcriptions for all the data. In this paper, we investigat...

Bibliographic Details
Main Authors:	El Amrani, Mohamed Yassine, Rahman, M.M. Hafizur, Wahiddin, Mohamed Ridza, Shah, Asadullah
Format:	Conference or Workshop Item
Language:	English
Published:	The Institute of Electrical and Electronics Engineers, Inc. 2017
Subjects:	TK7800 Electronics. Computer engineering. Computer hardware. Photoelectronic devices
Online Access:	http://irep.iium.edu.my/54937/1/54893_A%20Practical%20and%20Interactive%20%20Web-based.pdf http://irep.iium.edu.my/54937/12/54937_Testing%20Sphinx%E2%80%99s%20language_scopus.pdf http://irep.iium.edu.my/54937/ http://ieeexplore.ieee.org/document/7814882/

id	my.iium.irep.54937
record_format	dspace
spelling	my.iium.irep.549372018-02-04T06:49:47Z http://irep.iium.edu.my/54937/ Testing Sphinx’s language model fault-tolerance for the Holy Quran El Amrani, Mohamed Yassine Rahman, M.M. Hafizur Wahiddin, Mohamed Ridza Shah, Asadullah TK7800 Electronics. Computer engineering. Computer hardware. Photoelectronic devices The Carnegie Mellon University’s (CMU) Sphinx framework is increasingly used for the Arabic speech recognition in general and applied to the Holy Quran in particular. Generating the language model includes a tedious task of preparing the transcriptions for all the data. In this paper, we investigate the fault-tolerance of the automatically generated language model as compared to a corrected and uncorrected transcription with and without silence tagging. This editing addresses the different repetitions and pauses encountered during recitations. Experiments show that the average difference between the lowest and highest Word Error Rate (WER) for each configuration of the number of Senones is 0.6% when using all files for the training and 1.6% when using 80% of the files for training the language model of 17 chapters of the Holy Quran. Results show that the performance of trained models without any correction can be close to when all required rectifications of transcriptions are performed. The Institute of Electrical and Electronics Engineers, Inc. 2017-01-16 Conference or Workshop Item REM application/pdf en http://irep.iium.edu.my/54937/1/54893_A%20Practical%20and%20Interactive%20%20Web-based.pdf application/pdf en http://irep.iium.edu.my/54937/12/54937_Testing%20Sphinx%E2%80%99s%20language_scopus.pdf El Amrani, Mohamed Yassine and Rahman, M.M. Hafizur and Wahiddin, Mohamed Ridza and Shah, Asadullah (2017) Testing Sphinx’s language model fault-tolerance for the Holy Quran. In: 6th International Conference on Information and Communication Technology for the Muslim World (ICT4M 2016), 22nd-24th November 2016, Jakarta, Indonesia. 
http://ieeexplore.ieee.org/document/7814882/ 10.1109/ICT4M.2016.27
institution	Universiti Islam Antarabangsa Malaysia
building	IIUM Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	International Islamic University Malaysia
content_source	IIUM Repository (IREP)
url_provider	http://irep.iium.edu.my/
language	English
topic	TK7800 Electronics. Computer engineering. Computer hardware. Photoelectronic devices
spellingShingle	TK7800 Electronics. Computer engineering. Computer hardware. Photoelectronic devices El Amrani, Mohamed Yassine Rahman, M.M. Hafizur Wahiddin, Mohamed Ridza Shah, Asadullah Testing Sphinx’s language model fault-tolerance for the Holy Quran
description	The Carnegie Mellon University’s (CMU) Sphinx framework is increasingly used for the Arabic speech recognition in general and applied to the Holy Quran in particular. Generating the language model includes a tedious task of preparing the transcriptions for all the data. In this paper, we investigate the fault-tolerance of the automatically generated language model as compared to a corrected and uncorrected transcription with and without silence tagging. This editing addresses the different repetitions and pauses encountered during recitations. Experiments show that the average difference between the lowest and highest Word Error Rate (WER) for each configuration of the number of Senones is 0.6% when using all files for the training and 1.6% when using 80% of the files for training the language model of 17 chapters of the Holy Quran. Results show that the performance of trained models without any correction can be close to when all required rectifications of transcriptions are performed.
format	Conference or Workshop Item
author	El Amrani, Mohamed Yassine Rahman, M.M. Hafizur Wahiddin, Mohamed Ridza Shah, Asadullah
author_facet	El Amrani, Mohamed Yassine Rahman, M.M. Hafizur Wahiddin, Mohamed Ridza Shah, Asadullah
author_sort	El Amrani, Mohamed Yassine
title	Testing Sphinx’s language model fault-tolerance for the Holy Quran
title_short	Testing Sphinx’s language model fault-tolerance for the Holy Quran
title_full	Testing Sphinx’s language model fault-tolerance for the Holy Quran
title_fullStr	Testing Sphinx’s language model fault-tolerance for the Holy Quran
title_full_unstemmed	Testing Sphinx’s language model fault-tolerance for the Holy Quran
title_sort	testing sphinx’s language model fault-tolerance for the holy quran
publisher	The Institute of Electrical and Electronics Engineers, Inc.
publishDate	2017
url	http://irep.iium.edu.my/54937/1/54893_A%20Practical%20and%20Interactive%20%20Web-based.pdf http://irep.iium.edu.my/54937/12/54937_Testing%20Sphinx%E2%80%99s%20language_scopus.pdf http://irep.iium.edu.my/54937/ http://ieeexplore.ieee.org/document/7814882/
_version_	1643614646064644096
score	13.189025

Similar Items