Development of language identification system using MFCC and vector quantization

Bibliographic Details
Main Authors: Gunawan, Teddy Surya, Husain, Rashida, Kartiwi, Mira
Format: Conference or Workshop Item
Language: English
Published: 2017
Online Access: http://irep.iium.edu.my/60070/13/60070-Development%20of%20Language%20Identification.pdf
http://irep.iium.edu.my/60070/
http://icsima.ieeemy-ims.org/17/
Description
Summary: This paper investigates the development of a language identification system based on Mel-Frequency Cepstral Coefficients (MFCC) and the Vector Quantization (VQ) algorithm. Ten speakers, six male and four female, were chosen randomly from an online language database. The speakers spoke different languages, including Arabic, Chinese, English, Korean and Malay. MFCCs are extracted from the speech to form the feature vectors, and the VQ algorithm is then used as the classifier. The recognition rate is calculated for each language. Several experiments were conducted to find the optimum parameters; a sampling frequency of 16000 Hz and a codebook size of 75 provided good results. On average, the recognition rate across the five languages evaluated was 78%. The experimental results show that the proposed system provides a good recognition rate.
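
The classification step described in the summary can be illustrated with a minimal sketch: one codebook is trained per language, and a test utterance is assigned to the language whose codebook gives the lowest average distortion. This is not the authors' implementation. The summary does not say how the codebooks were trained, so plain k-means is assumed here; the function names are hypothetical; the toy three-dimensional Gaussian vectors stand in for real MFCC frames; and the codebook size of 8 is chosen only to keep the example small (the paper reports good results with 75).

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_codebook(vectors, size, iters=20, seed=0):
    """Build a VQ codebook with plain k-means (an assumed training method)."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(vectors, size)]
    for _ in range(iters):
        # Assign every feature vector to its nearest codeword.
        cells = [[] for _ in codebook]
        for v in vectors:
            nearest = min(range(size), key=lambda i: sq_dist(v, codebook[i]))
            cells[nearest].append(v)
        # Move each codeword to the centroid of its cell; empty cells stay put.
        for i, cell in enumerate(cells):
            if cell:
                codebook[i] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook

def distortion(vectors, codebook):
    """Average squared distance from each vector to its nearest codeword."""
    total = sum(min(sq_dist(v, c) for c in codebook) for v in vectors)
    return total / len(vectors)

def identify(vectors, codebooks):
    """Return the language whose codebook gives the lowest average distortion."""
    return min(codebooks, key=lambda lang: distortion(vectors, codebooks[lang]))

def toy_features(center, n, rng):
    """Hypothetical stand-in for MFCC frames: Gaussian points around `center`."""
    return [[rng.gauss(c, 0.5) for c in center] for _ in range(n)]

rng = random.Random(1)
train = {
    "Malay": toy_features([0.0, 0.0, 0.0], 200, rng),
    "Korean": toy_features([5.0, 5.0, 5.0], 200, rng),
}
# One codebook per language; size=8 is an illustrative choice, not the paper's 75.
codebooks = {lang: train_codebook(vecs, size=8) for lang, vecs in train.items()}

# A held-out "utterance" drawn from the same toy distribution as Korean.
test_utterance = toy_features([5.0, 5.0, 5.0], 50, rng)
result = identify(test_utterance, codebooks)
print(result)
```

In a real system the toy vectors would be replaced by MFCC frames extracted from speech sampled at the chosen rate, and the recognition rate for each language would be the fraction of test utterances for which identify returns the correct label.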