Improving Speaker Diarization for Low-Resourced Sarawak Malay Language Conversational Speech Corpus
Speaker diarization plays a vital role in speech transcription involving conversations as it improves the transcribed content’s accuracy, comprehension, and usability. By having a speech transcription diarized, the conversation data has a more structured presentation, allowing for a variety of appli...
Main Authors: | Mohd Zulhafiz, Rahim; Sarah Flora, Samson Juan; Fitri Suraya, Mohamad |
---|---|
Format: | Proceeding |
Language: | English |
Published: | IEEE 2023 |
Subjects: | QA75 Electronic computers. Computer science |
Online Access: | http://ir.unimas.my/id/eprint/43786/3/Improving%20Speaker%20Diarization.pdf http://ir.unimas.my/id/eprint/43786/ https://ieeexplore.ieee.org/document/10337314 |
id |
my.unimas.ir.43786 |
---|---|
record_format |
eprints |
spelling |
my.unimas.ir.43786 2023-12-20T01:43:32Z http://ir.unimas.my/id/eprint/43786/ IEEE 2023-12-12 Proceeding PeerReviewed text en http://ir.unimas.my/id/eprint/43786/3/Improving%20Speaker%20Diarization.pdf Mohd Zulhafiz, Rahim and Sarah Flora, Samson Juan and Fitri Suraya, Mohamad (2023) Improving Speaker Diarization for Low-Resourced Sarawak Malay Language Conversational Speech Corpus. In: 2023 International Conference on Asian Language Processing (IALP), 18-20 November 2023, Singapore. https://ieeexplore.ieee.org/document/10337314 |
institution |
Universiti Malaysia Sarawak |
building |
Centre for Academic Information Services (CAIS) |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Malaysia Sarawak |
content_source |
UNIMAS Institutional Repository |
url_provider |
http://ir.unimas.my/ |
language |
English |
topic |
QA75 Electronic computers. Computer science |
description |
Speaker diarization plays a vital role in transcribing conversational speech, as it improves the accuracy, comprehensibility, and usability of the transcribed content. A diarized transcription presents the conversation in a more structured form, enabling a variety of applications that rely on accurate speaker attribution. Even so, speaker diarization has been less explored for low-resourced languages, as the resources currently optimized for and applied in speaker diarization mostly target well-resourced languages such as English, Spanish, or French. In this paper, we propose an approach that uses pseudo-labeled speech data to perform self-training on x-vector models to improve diarization accuracy. The proposed method uses almost 13 hours of unlabeled Sarawak Malay conversational speech obtained from the Kalaka: Language Map of Malaysia website for training, as well as 1 hour and 26 minutes of manually labeled Sarawak Malay speech data for testing and evaluation. We demonstrate how speaker diarization models can be fine-tuned with the pseudo-labeled data. |
format |
Proceeding |
author |
Mohd Zulhafiz, Rahim Sarah Flora, Samson Juan Fitri Suraya, Mohamad |
title |
Improving Speaker Diarization for Low-Resourced Sarawak Malay Language Conversational Speech Corpus |
publisher |
IEEE |
publishDate |
2023 |
url |
http://ir.unimas.my/id/eprint/43786/3/Improving%20Speaker%20Diarization.pdf http://ir.unimas.my/id/eprint/43786/ https://ieeexplore.ieee.org/document/10337314 |
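
The self-training idea mentioned in the description can be illustrated with a minimal sketch of one step: keeping only confident pseudo-labels. This is not the authors' pipeline, which the record does not detail. The function names, the toy 2-D vectors standing in for x-vector embeddings, the cosine-similarity scoring, and the 0.9 threshold are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def select_pseudo_labels(embeddings, centroids, threshold=0.9):
    """Assign each segment to its most similar speaker centroid and keep
    only assignments whose similarity reaches the (assumed) threshold.

    Returns a list of (segment_index, speaker_index) pairs.
    """
    selected = []
    for i, emb in enumerate(embeddings):
        scores = [cosine(emb, c) for c in centroids]
        best = max(range(len(centroids)), key=scores.__getitem__)
        if scores[best] >= threshold:
            selected.append((i, best))
    return selected


# Toy stand-ins for segment embeddings and per-speaker centroids (hypothetical).
segments = [[1.0, 0.1], [0.1, 1.0], [0.7, 0.7]]
speakers = [[1.0, 0.0], [0.0, 1.0]]

labels = select_pseudo_labels(segments, speakers)
print(labels)
```

In a self-training loop of this kind, the retained pairs would typically be used as training targets for the next round of fine-tuning, while ambiguous segments (such as the third toy segment, which is equally close to both speakers) are left out.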