Staff View: Bidirectional parallel echo state network for speech emotion recognition

Bidirectional parallel echo state network for speech emotion recognition

Speech is an effective way of communicating and exchanging complex information between humans. Speech signals have attracted great attention in human-computer interaction. Consequently, emotion recognition from speech has become an active research topic in human-machine interaction. In...

Bibliographic Details
Main Authors:	Ibrahim, Hemin, Loo, Chu Kiong, Alnajjar, Fady
Format:	Article
Published:	Springer London Ltd 2022
Subjects:	QA75 Electronic computers. Computer science
Online Access:	http://eprints.um.edu.my/41245/

id	my.um.eprints.41245
record_format	eprints
spelling	my.um.eprints.412452023-09-15T03:07:07Z http://eprints.um.edu.my/41245/ Bidirectional parallel echo state network for speech emotion recognition Ibrahim, Hemin Loo, Chu Kiong Alnajjar, Fady QA75 Electronic computers. Computer science Speech is an effective way for communicating and exchanging complex information between humans. Speech signal has involved a great attention in human-computer interaction. Therefore, emotion recognition from speech has become a hot research topic in the field of interacting machines with humans. In this paper, we proposed a novel speech emotion recognition system by adopting multivariate time series handcrafted feature representation from speech signals. Bidirectional echo state network with two parallel reservoir layers has been applied to capture additional independent information. The parallel reservoirs produce multiple representations for each direction from the bidirectional data with two stages of concatenation. The sparse random projection approach has been adopted to reduce the high-dimensional sparse output for each direction separately from both reservoirs. Random over-sampling and random under-sampling methods are used to overcome the imbalanced nature of the used speech emotion datasets. The performance of the proposed parallel ESN model is evaluated from the speaker-independent experiments on EMO-DB, SAVEE, RAVDESS, and FAU Aibo datasets. The results show that the proposed SER model is superior to the single reservoir and the state-of-the-art studies. Springer London Ltd 2022-10 Article PeerReviewed Ibrahim, Hemin and Loo, Chu Kiong and Alnajjar, Fady (2022) Bidirectional parallel echo state network for speech emotion recognition. Neural Computing & Applications, 34 (20). pp. 17581-17599. ISSN 0941-0643, DOI https://doi.org/10.1007/s00521-022-07410-2 <https://doi.org/10.1007/s00521-022-07410-2>. 10.1007/s00521-022-07410-2
institution	Universiti Malaya
building	UM Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Malaya
content_source	UM Research Repository
url_provider	http://eprints.um.edu.my/
topic	QA75 Electronic computers. Computer science
spellingShingle	QA75 Electronic computers. Computer science Ibrahim, Hemin Loo, Chu Kiong Alnajjar, Fady Bidirectional parallel echo state network for speech emotion recognition
description	Speech is an effective way of communicating and exchanging complex information between humans, and speech signals have attracted great attention in human-computer interaction. Consequently, emotion recognition from speech has become an active research topic in human-machine interaction. This paper proposes a novel speech emotion recognition (SER) system that uses a multivariate time series representation of handcrafted features extracted from speech signals. A bidirectional echo state network (ESN) with two parallel reservoir layers is applied to capture additional independent information. The parallel reservoirs produce multiple representations for each direction of the bidirectional data, with two stages of concatenation. Sparse random projection is used to reduce the high-dimensional sparse output of both reservoirs, separately for each direction. Random over-sampling and random under-sampling are used to address the imbalanced nature of the speech emotion datasets used. The proposed parallel ESN model is evaluated in speaker-independent experiments on the EMO-DB, SAVEE, RAVDESS, and FAU Aibo datasets. The results show that the proposed SER model outperforms both a single-reservoir model and state-of-the-art studies.
format	Article
author	Ibrahim, Hemin Loo, Chu Kiong Alnajjar, Fady
author_facet	Ibrahim, Hemin Loo, Chu Kiong Alnajjar, Fady
author_sort	Ibrahim, Hemin
title	Bidirectional parallel echo state network for speech emotion recognition
title_short	Bidirectional parallel echo state network for speech emotion recognition
title_full	Bidirectional parallel echo state network for speech emotion recognition
title_fullStr	Bidirectional parallel echo state network for speech emotion recognition
title_full_unstemmed	Bidirectional parallel echo state network for speech emotion recognition
title_sort	bidirectional parallel echo state network for speech emotion recognition
publisher	Springer London Ltd
publishDate	2022
url	http://eprints.um.edu.my/41245/
_version_	1778161646592589824
score	13.160551
