DTWFF-pitch feature and faster neural network convergence for speech recognition

This paper presents the pre-processing of speech templates for an artificial neural network (ANN). The processed features are pitch and Linear Predictive Coefficients (LPC) for the input and reference templates, based on the Dynamic Time Warping (DTW) algorithm. The first task is to extract pitch features usin...

Bibliographic Details
Main Authors: Sudirman, Rubita, Salleh, Sh. Hussain, Salleh, Shaharuddin
Format: Article
Language: English
Published: Faculty of Electrical Engineering 2007
Subjects: TK Electrical engineering. Electronics Nuclear engineering
Online Access: http://eprints.utm.my/id/eprint/8061/3/RubitaSudirman2007_DTWFFPitchFeatureandFasterNeural.pdf
http://eprints.utm.my/id/eprint/8061/4/564200.pdf
http://eprints.utm.my/id/eprint/8061/
id my.utm.8061
record_format eprints
citation Sudirman, Rubita and Salleh, Sh. Hussain and Salleh, Shaharuddin (2007) DTWFF-pitch feature and faster neural network convergence for speech recognition. Elektrika, 9 (1). pp. 9-13. ISSN 0128-4428. Peer reviewed; published 2007-06; record timestamp 2013-11-28T03:19:08Z.
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
language English
topic TK Electrical engineering. Electronics Nuclear engineering
description This paper presents the pre-processing of speech templates for an artificial neural network (ANN). The processed features are pitch and Linear Predictive Coefficients (LPC) for the input and reference templates, based on the Dynamic Time Warping (DTW) algorithm. The first task is to extract pitch features using the Pitch Scale Harmonic Filter algorithm. The second task is to align the input frames (test set) to the reference template (training set) using the DTW fixing frame (DTW-FF) algorithm. This time normalization step is needed for data of unequal length: it adjusts the test set and the training set to the same number of frames. With both the pitch and LPC features fixed to a common number of frames, speech recognition using a neural network can be performed. Combining DTW-FF and pitch features yields a high recognition rate, as high as 100%, for the Malay digit words 0-9. A further task is to search for the global minimum of the NN error surface using the conjugate gradient algorithm in place of steepest gradient descent in the back-propagation algorithm. Results showed that the conjugate gradient algorithm is able to find a better minimum.
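The time normalization step in the description can be illustrated with a generic DTW alignment. The abstract does not give the details of DTW-FF, so the following is only a minimal sketch under stated assumptions: the function names (frame_distance, dtw_path, fix_frames), the squared Euclidean frame distance, and the choice to average all test frames that map onto the same reference frame are hypothetical and are not taken from the paper.

```python
from math import inf


def frame_distance(a, b):
    """Squared Euclidean distance between two feature vectors (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dtw_path(reference, test):
    """Return the minimum-cost warping path as (ref_index, test_index) pairs."""
    n, m = len(reference), len(test)
    # cost[i][j] = best accumulated cost aligning reference[:i] with test[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(reference[i - 1], test[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end of both sequences to the start.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # diagonal move
            (cost[i - 1][j], i - 1, j),          # advance reference only
            (cost[i][j - 1], i, j - 1),          # advance test only
        ]
        _, i, j = min(steps)
    path.reverse()
    return path


def fix_frames(reference, test):
    """Warp `test` so it has exactly as many frames as `reference`.

    Assumption: test frames mapped to the same reference frame are averaged.
    """
    buckets = [[] for _ in reference]
    for ref_idx, test_idx in dtw_path(reference, test):
        buckets[ref_idx].append(test[test_idx])
    aligned = []
    for frames in buckets:
        dim = len(frames[0])
        aligned.append([sum(f[k] for f in frames) / len(frames) for k in range(dim)])
    return aligned


if __name__ == "__main__":
    reference = [[0.0], [1.0], [2.0]]            # 3 reference frames
    test = [[0.0], [0.0], [1.0], [2.0], [2.0]]   # 5 input frames
    print(len(fix_frames(reference, test)))      # same frame count as reference
```

After this step every utterance has the frame count of its reference template, so the features can be fed to a network with a fixed-size input layer.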
format Article
author Sudirman, Rubita
Salleh, Sh. Hussain
Salleh, Shaharuddin
title DTWFF-pitch feature and faster neural network convergence for speech recognition
publisher Faculty of Electrical Engineering
publishDate 2007