Staff View: On the use of voice activity detection in speech emotion recognition

On the use of voice activity detection in speech emotion recognition

Emotion recognition through speech has many potential applications; however, the challenge lies in achieving a high emotion recognition rate with limited resources or in the presence of interference such as noise. In this paper, we explore the possibility of improving speech emotion recognition by utilizing t...

Bibliographic Details
Main Authors:	Alghifari, Muhammad Fahreza, Gunawan, Teddy Surya, Wan Nordin, Mimi Aminah, Ahmad Qadri, Syed Asif, Kartiwi, Mira, Janin, Zuriati
Format:	Article
Language:	English
Published:	Institute of Advanced Engineering and Science 2019
Subjects:	T Technology (General) TK7885 Computer engineering
Online Access:	http://irep.iium.edu.my/73890/1/73890_On%20the%20Use%20of%20Voice%20Activity.pdf http://irep.iium.edu.my/73890/7/73890_On%20the%20use%20of%20voice%20activity%20detection%20in%20speech%20emotion%20recognition_Scopus.pdf http://irep.iium.edu.my/73890/ http://www.beei.org/index.php/EEI/article/view/1646/1208

id	my.iium.irep.73890
record_format	dspace
spelling	my.iium.irep.738902020-02-26T08:19:41Z http://irep.iium.edu.my/73890/ On the use of voice activity detection in speech emotion recognition Alghifari, Muhammad Fahreza Gunawan, Teddy Surya Wan Nordin, Mimi Aminah Ahmad Qadri, Syed Asif Kartiwi, Mira Janin, Zuriati T Technology (General) TK7885 Computer engineering Emotion recognition through speech has many potential applications, however the challenge comes from achieving a high emotion recognition while using limited resources or interference such as noise. In this paper we have explored the possibility of improving speech emotion recognition by utilizing the voice activity detection (VAD) concept. The emotional voice data from the Berlin Emotion Database (EMO-DB) and a custom-made database LQ Audio Dataset are firstly preprocessed by VAD before feature extraction. The features are then passed to the deep neural network for classification. In this paper, we have chosen MFCC to be the sole determinant feature. From the results obtained using VAD and without, we have found that the VAD improved the recognition rate of 5 emotions (happy, angry, sad, fear, and neutral) by 3.7% when recognizing clean signals, while the effect of using VAD when training a network with both clean and noisy signals improved our previous results by 50%. Institute of Advanced Engineering and Science 2019-12 Article PeerReviewed application/pdf en http://irep.iium.edu.my/73890/1/73890_On%20the%20Use%20of%20Voice%20Activity.pdf application/pdf en http://irep.iium.edu.my/73890/7/73890_On%20the%20use%20of%20voice%20activity%20detection%20in%20speech%20emotion%20recognition_Scopus.pdf Alghifari, Muhammad Fahreza and Gunawan, Teddy Surya and Wan Nordin, Mimi Aminah and Ahmad Qadri, Syed Asif and Kartiwi, Mira and Janin, Zuriati (2019) On the use of voice activity detection in speech emotion recognition. Bulletin of Electrical Engineering and Informatics, 8 (4). pp. 1324-1332. 
ISSN 2302-9285 E-ISSN 2302-9285 http://www.beei.org/index.php/EEI/article/view/1646/1208 10.11591/eei.v8i4.1646
institution	Universiti Islam Antarabangsa Malaysia
building	IIUM Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	International Islamic University Malaysia
content_source	IIUM Repository (IREP)
url_provider	http://irep.iium.edu.my/
language	English
topic	T Technology (General) TK7885 Computer engineering
spellingShingle	T Technology (General) TK7885 Computer engineering Alghifari, Muhammad Fahreza Gunawan, Teddy Surya Wan Nordin, Mimi Aminah Ahmad Qadri, Syed Asif Kartiwi, Mira Janin, Zuriati On the use of voice activity detection in speech emotion recognition
description	Emotion recognition through speech has many potential applications; however, the challenge lies in achieving a high emotion recognition rate with limited resources or in the presence of interference such as noise. In this paper, we explore the possibility of improving speech emotion recognition by utilizing the voice activity detection (VAD) concept. The emotional voice data from the Berlin Emotion Database (EMO-DB) and a custom-made database, the LQ Audio Dataset, are first preprocessed by VAD before feature extraction. The features are then passed to a deep neural network for classification. In this paper, we have chosen MFCC as the sole determinant feature. Comparing the results obtained with and without VAD, we found that VAD improved the recognition rate of 5 emotions (happy, angry, sad, fear, and neutral) by 3.7% when recognizing clean signals, while using VAD when training a network with both clean and noisy signals improved our previous results by 50%.
format	Article
author	Alghifari, Muhammad Fahreza Gunawan, Teddy Surya Wan Nordin, Mimi Aminah Ahmad Qadri, Syed Asif Kartiwi, Mira Janin, Zuriati
author_facet	Alghifari, Muhammad Fahreza Gunawan, Teddy Surya Wan Nordin, Mimi Aminah Ahmad Qadri, Syed Asif Kartiwi, Mira Janin, Zuriati
author_sort	Alghifari, Muhammad Fahreza
title	On the use of voice activity detection in speech emotion recognition
title_short	On the use of voice activity detection in speech emotion recognition
title_full	On the use of voice activity detection in speech emotion recognition
title_fullStr	On the use of voice activity detection in speech emotion recognition
title_full_unstemmed	On the use of voice activity detection in speech emotion recognition
title_sort	on the use of voice activity detection in speech emotion recognition
publisher	Institute of Advanced Engineering and Science
publishDate	2019
url	http://irep.iium.edu.my/73890/1/73890_On%20the%20Use%20of%20Voice%20Activity.pdf http://irep.iium.edu.my/73890/7/73890_On%20the%20use%20of%20voice%20activity%20detection%20in%20speech%20emotion%20recognition_Scopus.pdf http://irep.iium.edu.my/73890/ http://www.beei.org/index.php/EEI/article/view/1646/1208
_version_	1662753737700016128
score	13.250246
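
The description field above says the recordings are preprocessed by VAD before MFCC extraction, but this record does not say which VAD algorithm the authors used. As a rough illustration of the preprocessing idea only, the sketch below assumes a simple energy-threshold VAD: frames much quieter than the loudest frame are dropped, and the remaining samples would then go on to feature extraction. The function names, the 25 ms frame length, and the -35 dB threshold are all hypothetical choices, not values taken from the paper.

```python
import math


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each full, non-overlapping frame."""
    n_frames = len(samples) // frame_len
    return [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def energy_vad(samples, sample_rate, frame_ms=25.0, threshold_db=-35.0):
    """Keep only frames whose energy is within threshold_db of the loudest frame.

    Assumption: a plain energy threshold stands in for whatever VAD the
    paper actually used. Trailing samples that do not fill a frame are dropped.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000.0))
    energies = frame_energies(samples, frame_len)
    peak = max(energies, default=0.0)
    if peak <= 0.0:
        return []  # empty or all-silent input: nothing counts as speech
    kept = []
    for i, energy in enumerate(energies):
        # Frame level relative to the loudest frame, floored to avoid log10(0).
        level_db = 10.0 * math.log10(max(energy, 1e-12 * peak) / peak)
        if level_db > threshold_db:
            kept.extend(samples[i * frame_len:(i + 1) * frame_len])
    return kept


# Usage: one second of silence followed by one second of a 440 Hz tone.
sr = 16000
silence = [0.0] * sr
tone = [math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]
speech_only = energy_vad(silence + tone, sr)
print(len(speech_only) / sr, "seconds kept")
```

In the pipeline described in the abstract, the retained samples would next be converted to MFCC features and passed to the neural network; those steps are omitted here because the record gives no details about them.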

Similar Items