Speech Recorder and Translator using Google Cloud Speech-to-Text and Translation

YouTube, the most popular video website, has about 2 billion users worldwide who speak and understand different languages. Subtitles are essential for users to get the message of a video. However, not all video owners provide subtitles for their videos. This causes potential audiences to hav...

Bibliographic Details
Main Authors: J.Y, Chan; Hui Hui, Wang
Format: Article
Language: English
Published: Penerbit Universiti Malaysia Sarawak 2021
Subjects: Q Science (General); T Technology (General)
Online Access: http://ir.unimas.my/id/eprint/36884/1/recorder1.pdf
http://ir.unimas.my/id/eprint/36884/
https://publisher.unimas.my/ojs/index.php/JITA/index
id my.unimas.ir.36884
record_format eprints
spelling my.unimas.ir.36884 2022-09-14T07:06:53Z http://ir.unimas.my/id/eprint/36884/
Penerbit Universiti Malaysia Sarawak 2021-11-30 Article PeerReviewed text en
J.Y, Chan and Hui Hui, Wang (2021) Speech Recorder and Translator using Google Cloud Speech-to-Text and Translation. Journal of IT in Asia, Vol.09 (2021). pp. 11-28. ISSN 1823-5042
https://publisher.unimas.my/ojs/index.php/JITA/index
institution Universiti Malaysia Sarawak
building Centre for Academic Information Services (CAIS)
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaysia Sarawak
content_source UNIMAS Institutional Repository
url_provider http://ir.unimas.my/
language English
topic Q Science (General)
T Technology (General)
description YouTube, the most popular video website, has about 2 billion users worldwide who speak and understand different languages. Subtitles are essential for users to get the message of a video. However, not all video owners provide subtitles for their videos, which makes it difficult for potential audiences to understand the video content. This study therefore proposed a speech recorder and translator to solve this problem. The general concept was to combine Automatic Speech Recognition (ASR) and translation technologies to recognize the video content and translate it into other languages. The paper compared and discussed three ASR technologies: Google Cloud Speech-to-Text, Limecraft Transcriber, and VoxSigma. The proposed system used Google Cloud Speech-to-Text because it supports more languages than Limecraft Transcriber and VoxSigma and was more flexible to use with Google Cloud Translation. The paper also included a questionnaire about the crucial features of the speech recorder and translator. A total of 19 university students participated in the questionnaire, and most respondents stated that high translation accuracy is vital for the proposed system. The paper also discussed a related work on a speech recorder and translator: a study that compared speech recognition between ordinary voices and speech-impaired voices, using a mobile application to record acoustic voice input. In contrast to that existing mobile app, this project proposed a web application, making it a different and new study, especially in terms of development and user experience. The project developed the proposed system successfully. The results showed that Google Cloud Speech-to-Text and Translation were reliable for video translation. However, the system could not recognize speech when the background music was too loud, and it suffered from the problem of direct (literal) translation, which was challenging. Thus, future research may need a custom-trained model. In conclusion, the proposed system contributes a new idea: a web application to overcome the language barrier on a video-watching platform.