Fundamental Research Grant Scheme (FRGS) - FRGS19-076-0684, Speech Emotion Recognition and Depression Prediction Based on Speech Analysis using Deep Neural Networks

Bibliographic Details
Main Authors: Gunawan, Teddy Surya; Draman, Samsul; Kartiwi, Mira; Borhan, Lihanna; Abdul Malik, Noreha; Abdul Rahman, Farah Diyana; Elsheikh, Elsheikh Mohamed Ahmed; Alghifari, Muhammad Fahreza; Ahmad Qadri, Syed Asif; Ashraf, Arselan; Wani, Taiba Majid
Format: Monograph
Language: English
Published: 2022
Subjects:
Online Access: http://irep.iium.edu.my/96854/1/GunawanFRGS19-076-0684FinalReportFeb22.pdf
http://irep.iium.edu.my/96854/
Description
Summary: Speech signals carry information that computers can use to infer a user's state, enabling tasks such as emotion recognition and depression prediction. Applications range from customer service to depression prevention. In this research, we propose several deep-learning-based methodologies for detecting emotion and depression. We used variants of deep neural networks, such as deep feedforward networks and convolutional networks. The deep learning models were trained on well-known databases such as the Berlin Emotion Database and the DAIC-WOZ Depression Dataset. The algorithm achieves an accuracy of 80.5 percent for speech emotion recognition across four languages: English, German, French, and Italian. The current algorithm detects depression with an accuracy of 60.1 percent when tested on the DAIC-WOZ dataset. Additionally, this research resulted in the creation of the Sorrow Analysis Dataset, an English depression audio dataset comprising 64 distinct samples from depressed and non-depressed individuals. Further validation using 1-dimensional convolutional networks resulted in an average accuracy of 97 percent. Further research could be conducted using other deep learning architectures, other datasets, and implementation on edge computing.