Staff View: NN with DTW-FF Coefficients and Pitch Feature for Speaker Recognition

NN with DTW-FF Coefficients and Pitch Feature for Speaker Recognition

This paper proposes a new method for extracting speech features along a warping path using dynamic programming (DP). It describes how the LPC features are extracted and how those coefficients are normalized against the template pattern according to the selected average number...

Bibliographic Details
Main Authors:	Sudirman, Rubita, Salleh, Sh-Hussain, Salleh, Shaharuddin
Format:	Article
Language:	English
Published:	School of Postgraduate Studies, UTM 2006
Subjects:	TK Electrical engineering. Electronics Nuclear engineering
Online Access:	http://eprints.utm.my/id/eprint/1655/1/rubita05_NN_with_DTWFF.pdf http://eprints.utm.my/id/eprint/1655/

id	my.utm.1655
record_format	eprints
spelling	my.utm.16552010-06-01T02:56:57Z http://eprints.utm.my/id/eprint/1655/ NN with DTW-FF Coefficients and Pitch Feature for Speaker Recognition Sudirman, Rubita Salleh, Sh-Hussain Salleh, Shaharuddin TK Electrical engineering. Electronics Nuclear engineering This paper proposes a new method to extract speech features in a warping path using dynamic programming (DP). The new method presented in this paper described how the LPC feature is extracted and those coefficients are normalized against the template pattern according to the selected average number of frames over the samples collected. The idea behind this method is due to neural network (NN) limitation where a fixed amount of input nodes is needed for every input class especially in the application of multiple inputs. The new feature processing used the modified version of traditional DTW called as DTW-FF algorithm to fix the input size so that the source and template frames have equal number of frames. Then the DTW-FF coefficients are retained and later being used as inputs into the MLP neural network training and testing. Thus, the main objective of this research is to find an alternative method to reduce the amount of computation and complexity in a neural network for speaker recognition which can be done by reducing the number of inputs into the network by using warping process, so the local distance scores of the warping path will be utilized instead of the global distance scores. The speaker recognition is performed using the back-propagation neural network (BPNN) algorithm to enhance the recognition performance. The results compare DTW using LPC coefficients to BPNN with DTW-FF coefficients; BPNN with DTW-FF coefficients shows a higher recognition rate than DTW with LPC coefficients. The last task is to introduce another input feature into the neural network, namely pitch. 
The result for BPNN with DTW-FF plus pitch feature achieved its high recognition rate faster than the combination of BPNN and DTW-FF feature only. School of Postgraduate Studies, UTM 2006-07-26 Article NonPeerReviewed application/pdf en http://eprints.utm.my/id/eprint/1655/1/rubita05_NN_with_DTWFF.pdf Sudirman, Rubita and Salleh, Sh-Hussain and Salleh, Shaharuddin (2006) NN with DTW-FF Coefficients and Pitch Feature for Speaker Recognition. Regional Postgraduate Conference on Engineering and Science . pp. 775-779.
institution	Universiti Teknologi Malaysia
building	UTM Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Teknologi Malaysia
content_source	UTM Institutional Repository
url_provider	http://eprints.utm.my/
language	English
topic	TK Electrical engineering. Electronics Nuclear engineering
spellingShingle	TK Electrical engineering. Electronics Nuclear engineering Sudirman, Rubita Salleh, Sh-Hussain Salleh, Shaharuddin NN with DTW-FF Coefficients and Pitch Feature for Speaker Recognition
description	This paper proposes a new method for extracting speech features along a warping path using dynamic programming (DP). It describes how the LPC features are extracted and how those coefficients are normalized against the template pattern according to the selected average number of frames over the collected samples. The method is motivated by a limitation of neural networks (NN): a fixed number of input nodes is needed for every input class, especially in applications with multiple inputs. The new feature processing uses a modified version of traditional DTW, called the DTW-FF algorithm, to fix the input size so that the source and the template have an equal number of frames. The DTW-FF coefficients are then retained and used as inputs for training and testing the MLP neural network. The main objective of this research is thus to find an alternative method that reduces the amount of computation and the complexity of a neural network for speaker recognition. This is done by using the warping process to reduce the number of inputs into the network, so that the local distance scores of the warping path are used instead of the global distance scores. Speaker recognition is performed using the back-propagation neural network (BPNN) algorithm to enhance recognition performance. The results compare DTW using LPC coefficients with BPNN using DTW-FF coefficients; BPNN with DTW-FF coefficients shows a higher recognition rate than DTW with LPC coefficients. The last task is to introduce another input feature, pitch, into the neural network. BPNN with DTW-FF plus the pitch feature reached its high recognition rate faster than BPNN with the DTW-FF feature alone.
format	Article
author	Sudirman, Rubita Salleh, Sh-Hussain Salleh, Shaharuddin
author_facet	Sudirman, Rubita Salleh, Sh-Hussain Salleh, Shaharuddin
author_sort	Sudirman, Rubita
title	NN with DTW-FF Coefficients and Pitch Feature for Speaker Recognition
title_short	NN with DTW-FF Coefficients and Pitch Feature for Speaker Recognition
title_full	NN with DTW-FF Coefficients and Pitch Feature for Speaker Recognition
title_fullStr	NN with DTW-FF Coefficients and Pitch Feature for Speaker Recognition
title_full_unstemmed	NN with DTW-FF Coefficients and Pitch Feature for Speaker Recognition
title_sort	nn with dtw-ff coefficients and pitch feature for speaker recognition
publisher	School of Postgraduate Studies, UTM
publishDate	2006
url	http://eprints.utm.my/id/eprint/1655/1/rubita05_NN_with_DTWFF.pdf http://eprints.utm.my/id/eprint/1655/
_version_	1643643383469572096
score	13.209306
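
The description field in this record explains that DTW is used to give every utterance the same number of frames as a template, so that a neural network with a fixed number of input nodes can consume it. The record does not state the exact DTW-FF rule, so the following is only a minimal sketch of the general idea under stated assumptions: a plain DTW alignment with Euclidean local distance, followed by averaging the source frames that map to each template frame. The function names `dtw_path` and `fix_frames` are hypothetical and do not come from the paper.

```python
import math


def euclidean(a, b):
    """Local distance between two feature vectors (e.g. LPC coefficients)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_path(source, template):
    """Return the optimal warping path as (source_index, template_index) pairs.

    Assumes both sequences are non-empty lists of equal-length vectors.
    """
    n, m = len(source), len(template)
    inf = float("inf")
    # cost[i][j] = minimal accumulated distance aligning the first i source
    # frames with the first j template frames.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(source[i - 1], template[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the end of both sequences to the start.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # diagonal step
            (cost[i - 1][j], i - 1, j),          # several source frames, one template frame
            (cost[i][j - 1], i, j - 1),          # one source frame, several template frames
        ]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    path.reverse()
    return path


def fix_frames(source, template):
    """Warp `source` so it has exactly len(template) frames.

    Assumption (not from the paper): source frames aligned to the same
    template frame are averaged; a source frame aligned to several
    template frames is repeated.
    """
    buckets = [[] for _ in template]
    for i, j in dtw_path(source, template):
        buckets[j].append(source[i])
    fixed = []
    for group in buckets:
        dim = len(group[0])
        fixed.append([sum(f[k] for f in group) / len(group) for k in range(dim)])
    return fixed


# Toy one-dimensional "features": a 5-frame and a 2-frame utterance are both
# mapped onto a 3-frame template, giving a fixed-size network input.
template = [[0.0], [1.0], [2.0]]
long_utterance = [[0.0], [0.1], [1.0], [1.1], [2.0]]
short_utterance = [[0.0], [2.0]]
fixed_long = fix_frames(long_utterance, template)
fixed_short = fix_frames(short_utterance, template)
```

Both `fixed_long` and `fixed_short` end up with three frames, which is the property a fixed-input MLP needs. Whether the paper compresses frames by averaging or by another rule is not stated in this record.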
