Comparative study on Malay children vowel recognition using multi-layer perceptron and recurrent neural networks / Afshan Kordi

Bibliographic Details
Main Author: Afshan, Kordi
Format: Thesis
Published: 2012
Online Access: http://studentsrepo.um.edu.my/7715/5/afshan.pdf
http://studentsrepo.um.edu.my/7715/
Description
Summary: Speech recognition has become popular in recent decades because of its widespread applications, such as telephone systems, health care, data entry, speech-to-text processing, biometric systems, and the training of air traffic controllers. Among the technologies investigated for acoustic modeling of speech, Artificial Neural Networks (ANN) have attracted many researchers because they have shown good results in pattern recognition, especially in classification. Despite noteworthy progress in speech classification using neural networks, some issues in applying them remain unresolved. In particular, little work has been done on children's speech, which is more dynamic. Of the many neural network architectures that have been introduced, the most common ones suitable for speech recognition include the Multi-Layer Perceptron (MLP) and the Recurrent Neural Network (RNN). The purpose of this study is to compare the performance and recognition rate of these two types of network, in terms of signal length and number of hidden neurons, on sustained Malay vowels spoken by Malay children. Linear Predictive Coding (LPC) is used as the feature extractor to convert the speech signal into parametric coefficients. The Neural Network Toolbox™ (nntool) in Matlab® is used to classify the six Malay vowels (/a/, /e/, /ə/, /i/, /o/ and /u/) under 3-fold cross-validation, at different signal lengths and with different numbers of hidden neurons. Experiments were also done to compare the networks under single-frame and multiple-frame approaches. The results show that longer signal lengths give better recognition than shorter ones. MLP and RNN reached recognition rates of 83.79% and 83.10%, respectively. Vowel /i/ had the highest recognition rate with both methods.
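
To illustrate the feature-extraction step mentioned in the summary, the following is a minimal sketch of how LPC coefficients can be computed for one speech frame using the autocorrelation method with the Levinson-Durbin recursion. It is not taken from the thesis, which used Matlab; the function names, the LPC order, and the synthetic test frame are assumptions chosen only for illustration, and the summary does not state the order or frame length actually used.

```python
def autocorrelation(frame, max_lag):
    """Return the autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc(frame, order):
    """Estimate LPC coefficients with the Levinson-Durbin recursion.

    Returns (a, err), where a[0] == 1.0 and the prediction-error filter is
    A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order, and err is the final
    prediction-error energy.
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    if err == 0.0:
        # Silent frame: nothing to predict.
        return a, 0.0
    for i in range(1, order + 1):
        # Reflection coefficient for model order i.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


if __name__ == "__main__":
    # Hypothetical test frame: a decaying exponential, i.e. the impulse
    # response of a first-order all-pole filter (an assumption, not thesis data).
    frame = [0.9 ** n for n in range(200)]
    coeffs, residual = lpc(frame, order=2)
    print(coeffs, residual)
```

In a pipeline like the one the summary describes, a coefficient vector of this kind, computed per frame, would serve as the input features for the MLP or RNN classifier.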