BENCHMARKING WHISPER OPENAI ON SARAWAK LANGUAGES
End-to-end (E2E) models are reshaping automatic speech recognition (ASR), supplanting traditional approaches such as Hidden Markov model (HMM) and deep neural network (DNN)-based hybrid models. In essence, they replace crucial components of these traditional ASR models...
Main Author: | GERALD EINSTEIN CORNELIUS |
---|---|
Format: | Final Year Project Report |
Language: | English |
Published: | Universiti Malaysia Sarawak (UNIMAS), 2023 |
Subjects: | PE English |
Online Access: | http://ir.unimas.my/id/eprint/44201/1/Gerald%20Einstein%20ft.pdf http://ir.unimas.my/id/eprint/44201/ |
id |
my.unimas.ir.44201 |
---|---|
record_format |
eprints |
spelling |
my.unimas.ir.44201 2024-01-18T03:34:45Z http://ir.unimas.my/id/eprint/44201/ BENCHMARKING WHISPER OPENAI ON SARAWAK LANGUAGES GERALD EINSTEIN CORNELIUS PE English
Universiti Malaysia Sarawak, (UNIMAS) 2023 Final Year Project Report NonPeerReviewed text en http://ir.unimas.my/id/eprint/44201/1/Gerald%20Einstein%20ft.pdf GERALD EINSTEIN CORNELIUS (2023) BENCHMARKING WHISPER OPENAI ON SARAWAK LANGUAGES. [Final Year Project Report] (Unpublished) |
institution |
Universiti Malaysia Sarawak |
building |
Centre for Academic Information Services (CAIS) |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Malaysia Sarawak |
content_source |
UNIMAS Institutional Repository |
url_provider |
http://ir.unimas.my/ |
language |
English |
topic |
PE English |
description |
End-to-end (E2E) models are reshaping automatic speech recognition (ASR), supplanting traditional approaches such as Hidden Markov model (HMM) and deep neural network (DNN)-based hybrid models. In essence, they replace crucial components of these traditional ASR models by collapsing the module-based design into a single-network architecture within a deep learning framework. This simplification does not hinder speech recognition performance; it can even yield results superior to those of traditional ASR models. Recognising this potential, OpenAI developed the robust Whisper model, which is based on an E2E encoder-decoder transformer. While Whisper performs exceptionally well for English ASR, its performance on low-resource languages has not been established and is a topic of research interest. This work evaluates the Whisper model on speech data from under-resourced Sarawak languages, namely Sarawak Malay, Iban, Melanau, and the Bidayuh dialects of Jagoi and Bukar Sadong. The key contributions are a systematic literature review (SLR) and the development of an ASR system built on the Whisper model to measure the recognition accuracy of OpenAI's Whisper on Sarawak languages. The experimental results obtained from the developed ASR system, measured by Word Error Rate (WER), may serve as a baseline for future work on Whisper-based ASR for under-resourced Sarawak languages. |
format |
Final Year Project Report |
author |
GERALD EINSTEIN CORNELIUS |
title |
BENCHMARKING WHISPER OPENAI ON SARAWAK LANGUAGES |
publisher |
Universiti Malaysia Sarawak (UNIMAS) |
publishDate |
2023 |
url |
http://ir.unimas.my/id/eprint/44201/1/Gerald%20Einstein%20ft.pdf http://ir.unimas.my/id/eprint/44201/ |
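
The description above names Word Error Rate (WER) as the evaluation metric. As a minimal sketch of how WER is conventionally computed (word-level edit distance divided by the number of reference words), and not the report's own implementation, something like the following could be used. The function name `word_error_rate` is hypothetical, whitespace tokenisation with no text normalisation is an assumption, and the strings in the usage lines are placeholder tokens, not data from the report.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Assumption: words are separated by whitespace and compared exactly;
    no lowercasing or punctuation stripping is applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens only: one substitution plus one deletion over 4 words.
print(word_error_rate("w1 w2 w3 w4", "w1 wX w3"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference. Reported scores also depend heavily on the text normalisation applied before comparison, which this sketch deliberately leaves out.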