The Effectiveness of DTW-FF Coefficients and Pitch Feature in NN Speech Recognition
This paper presents a method for extracting speech features contained in the dynamic time warping path, which was originally derived from linear predictive coding (LPC). For recognition, the extracted features serve as inputs to a back-propagation neural network. The new method p...
Main Authors: | Sudirman, Rubita; Salleh, Sh-Hussain; Salleh, Shaharuddin |
---|---|
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2006 |
Subjects: | TK Electrical engineering. Electronics Nuclear engineering |
Online Access: | http://eprints.utm.my/id/eprint/1971/1/rubita06_Effectiveness_of_DTWFF_coefficients.pdf http://eprints.utm.my/id/eprint/1971/ |
id |
my.utm.1971 |
---|---|
record_format |
eprints |
spelling |
my.utm.1971 2017-08-30T04:15:25Z http://eprints.utm.my/id/eprint/1971/ 2006 Conference or Workshop Item PeerReviewed application/pdf en Sudirman, Rubita and Salleh, Sh-Hussain and Salleh, Shaharuddin (2006) The Effectiveness of DTW-FF Coefficients and Pitch Feature in NN Speech Recognition. In: Proceeding of the International Conference on Artificial Intelligence, Engineering and Technology, 22-24 November 2006, Kota Kinabalu, Sabah. |
institution |
Universiti Teknologi Malaysia |
building |
UTM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Teknologi Malaysia |
content_source |
UTM Institutional Repository |
url_provider |
http://eprints.utm.my/ |
language |
English |
topic |
TK Electrical engineering. Electronics Nuclear engineering |
description |
This paper presents a method for extracting speech features contained in the dynamic time warping (DTW) path, which was originally derived from linear predictive coding (LPC). For recognition, the extracted features serve as inputs to a back-propagation neural network. The novelty lies in how the features are extracted and how the resulting coefficients are normalized against the template pattern according to the selected average number of frames over the collected samples. The method is motivated by a limitation of neural networks (NN): a fixed number of input nodes is needed for every input class, especially in applications with multiple inputs. The main objective of this research is therefore to find an alternative method that reduces the amount of computation and the complexity of a neural network, in this case for speech recognition. One way to achieve this is to reduce the number of inputs to the network. This is done through the dynamic warping process, in which the local distance scores along the warping path are used instead of the global distance score. According to the literature review, past and most current research uses the global distance score or the LPC coefficients as input to the neural network. Using LPC directly presents the network with a large number of coefficients for each speech frame. |
format |
Conference or Workshop Item |
author |
Sudirman, Rubita Salleh, Sh-Hussain Salleh, Shaharuddin |
title |
The Effectiveness of DTW-FF Coefficients and Pitch Feature in NN Speech Recognition |
publishDate |
2006 |
url |
http://eprints.utm.my/id/eprint/1971/1/rubita06_Effectiveness_of_DTWFF_coefficients.pdf http://eprints.utm.my/id/eprint/1971/ |
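The description in this record contrasts the local distance scores along the DTW warping path with the single global distance score. The sketch below illustrates that distinction only; it is not the authors' implementation. The function names (`dtw_path_local_distances`, `resample_to_fixed_length`), the Euclidean frame distance, the symmetric step pattern, and the linear-interpolation step used to reach a fixed number of network inputs are all assumptions made for illustration. The paper's own normalization against the template pattern may differ.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_path_local_distances(test, template):
    """Align two sequences of feature vectors with basic DTW.

    Returns (global_distance, local_distances), where local_distances holds
    the frame-to-frame distance at each step of the optimal warping path.
    """
    n, m = len(test), len(template)
    inf = float("inf")
    # local[i][j]: distance between test frame i and template frame j
    local = [[frame_distance(test[i], template[j]) for j in range(m)]
             for i in range(n)]

    # acc[i][j]: minimum accumulated distance of any path from (0, 0) to (i, j)
    acc = [[inf] * m for _ in range(n)]
    acc[0][0] = local[0][0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best_prev = min(
                acc[i - 1][j] if i > 0 else inf,
                acc[i][j - 1] if j > 0 else inf,
                acc[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            acc[i][j] = local[i][j] + best_prev

    # Backtrack from the last cell to recover the optimal warping path.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((acc[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((acc[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((acc[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return acc[n - 1][m - 1], [local[a][b] for a, b in path]


def resample_to_fixed_length(values, num_inputs):
    """Linearly interpolate a variable-length list to exactly num_inputs values.

    Hypothetical stand-in for the paper's frame normalization, so that every
    utterance yields the same number of network inputs.
    """
    if num_inputs == 1 or len(values) == 1:
        return [values[0]] * num_inputs
    out = []
    for k in range(num_inputs):
        pos = k * (len(values) - 1) / (num_inputs - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


# Toy one-dimensional "frames"; real inputs would be LPC-derived vectors.
test_frames = [[0.0], [1.0], [2.0]]
template_frames = [[0.0], [2.0]]
global_score, path_scores = dtw_path_local_distances(test_frames, template_frames)
nn_inputs = resample_to_fixed_length(path_scores, 5)
```

With this step pattern the global score is simply the sum of the local scores along the path, so the path scores carry strictly more information than the single global value, while still giving one number per path step rather than a full coefficient vector per frame.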