The Effectiveness of DTW-FF Coefficients and Pitch Feature in NN Speech Recognition

This paper presents a method for extracting speech features contained in the dynamic time warping (DTW) path, which was originally derived from linear predictive coding (LPC). For recognition, the extracted features serve as inputs to a back-propagation neural network. The new method p...

Bibliographic Details
Main Authors: Sudirman, Rubita; Salleh, Sh-Hussain; Salleh, Shaharuddin
Format: Conference or Workshop Item
Language: English
Published: 2006
Subjects: TK Electrical engineering. Electronics Nuclear engineering
Online Access: http://eprints.utm.my/id/eprint/1971/1/rubita06_Effectiveness_of_DTWFF_coefficients.pdf
http://eprints.utm.my/id/eprint/1971/
id my.utm.1971
record_format eprints
spelling my.utm.1971 2017-08-30T04:15:25Z PeerReviewed application/pdf en Sudirman, Rubita and Salleh, Sh-Hussain and Salleh, Shaharuddin (2006) The Effectiveness of DTW-FF Coefficients and Pitch Feature in NN Speech Recognition. In: Proceeding of the International Conference on Artificial Intelligence, Engineering and Technology, 22-24 November 2006, Kota Kinabalu, Sabah.
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
language English
topic TK Electrical engineering. Electronics Nuclear engineering
description This paper presents a method for extracting speech features contained in the dynamic time warping (DTW) path, which was originally derived from linear predictive coding (LPC). For recognition, the extracted features serve as inputs to a back-propagation neural network. The novelty lies in how the features are extracted and how the coefficients are normalized against the template pattern according to the selected average number of frames over the collected samples. The method is motivated by a limitation of neural networks (NN): a fixed number of input nodes is needed for every input class, especially in applications with multiple inputs. The main objective of this research is therefore to find an alternative method that reduces the amount of computation and complexity in a neural network, in this case for speech recognition. One way to achieve this is to reduce the number of inputs to the network. This is done through the dynamic warping process, in which the local distance scores along the warping path are used instead of the global distance score. According to the literature review, past and most current research uses the global distance score or the LPC coefficients as input to the neural network. Feeding LPC coefficients directly presents the network with a large number of coefficients for each speech frame.
format Conference or Workshop Item
author Sudirman, Rubita
Salleh, Sh-Hussain
Salleh, Shaharuddin
title The Effectiveness of DTW-FF Coefficients and Pitch Feature in NN Speech Recognition
publishDate 2006
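
The description in this record mentions using the local distance scores along the DTW warping path, rather than the single global distance score, as a fixed-size input to a neural network. The following is a minimal sketch of that general idea, not the authors' implementation. The function names, the Euclidean frame distance, the symmetric step pattern, and the choice to average local distances per template frame (so that the feature length equals the template length) are all assumptions made for illustration.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two feature frames (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_align(sample, template):
    """Align sample to template with basic DTW.

    Returns (global_score, path, local), where path is a list of
    (sample_index, template_index) pairs and local is the matrix of
    frame-to-frame distances.
    """
    n, m = len(sample), len(template)
    inf = float("inf")
    local = [[frame_distance(s, t) for t in template] for s in sample]
    acc = [[inf] * m for _ in range(n)]
    acc[0][0] = local[0][0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best = min(
                acc[i - 1][j] if i > 0 else inf,
                acc[i][j - 1] if j > 0 else inf,
                acc[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            acc[i][j] = local[i][j] + best

    # Backtrack from the end of both sequences to recover the warping path.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((acc[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((acc[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((acc[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return acc[n - 1][m - 1], path, local


def path_features(sample, template):
    """Hypothetical fixed-length feature: mean local distance per template frame."""
    _, path, local = dtw_align(sample, template)
    m = len(template)
    sums = [0.0] * m
    counts = [0] * m
    for i, j in path:
        sums[j] += local[i][j]
        counts[j] += 1
    # Every template frame is visited at least once by a monotone path.
    return [s / c for s, c in zip(sums, counts)]


# Toy one-dimensional "frames"; real input would be LPC vectors per frame.
template = [[0.0], [1.0], [2.0]]
sample = [[0.0], [1.0], [1.0], [2.0]]
features = path_features(sample, template)
```

Under these assumptions the feature vector always has as many entries as the template has frames, regardless of how long the utterance is, which is what lets the network keep a fixed number of input nodes.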