Voice pathology detection and classification by adopting online sequential extreme learning machine

In the last decade, the implementation of machine learning algorithms in the analysis of voice disorder is paramount in order to provide a non-invasive voice pathology detection by only using audio signal. In spite of that, most recent systems of voice pathology work on a limited acoustic database....

Full description

Saved in:
Bibliographic Details
Main Authors: Al-Dhief, F. T., Baki, M. M., Abdul Latiff, N. M., Malik, N. N. N. A., Salim, N. S., Albader, M. A. A., Mahyuddin, N. M., Mohammed, M. A.
Format: Article
Language:English
Published: Institute of Electrical and Electronics Engineers Inc. 2021
Subjects:
Online Access:http://eprints.utm.my/id/eprint/95129/1/NurulMuazzahAbdulLatiff2021_VoicePathologyDetection.pdf
http://eprints.utm.my/id/eprint/95129/
http://dx.doi.org/10.1109/ACCESS.2021.3082565
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:In the last decade, the implementation of machine learning algorithms in the analysis of voice disorder is paramount in order to provide a non-invasive voice pathology detection by only using audio signal. In spite of that, most recent systems of voice pathology work on a limited acoustic database. In other words, the systems use one vowel, such as /a/, and ignore sentences and other vowels when analyzing the audio signal. Other key issues that should be considered in the systems are accuracy and time consumption of an algorithm. Online Sequential Extreme Learning Machine (OSELM) is one of the machine learning algorithms that can be regarded as a rapid and accurate algorithm in the classification process. Therefore, this paper presents a voice pathology detection and classification system by using OSELM algorithm as a classifier, and Mel-frequency cepstral coefficient (MFCC) as a featured extraction. In this work, the voice samples were taken from the Saarbrücken voice database (SVD). This system involves two parts of the database; the first part includes all voices in SVD with sentences and vowels /a/, /i/, and /u/, which are uttered in high, low, and normal pitches; and the second part utilizes voice samples of the common three types of pathologies (cyst, polyp, and paralysis) based on the vowel /a/ that is produced in normal pitch. The experimental results have shown that OSELM was able to achieve the highest accuracy up to 91.17%, 94% of precision, and 91% of recall. Furthermore, OSELM obtained 87%, 87.55%, and 97.67% for f-measure, G-mean, and specificity, respectively. The proposed system also presents a high ability to achieve detection and classification results in real-time clinical applications.