Speech to text with emoji

Speech transcription technology has significantly changed the media, entertainment, and education fields. Transcription services have greatly simplified record-keeping, research, and note-taking by removing the need to spend hours manually transcribing protracted audio or video segmen...

Bibliographic Details
Main Author: Tong, Kah Pau
Format: Final Year Project / Dissertation / Thesis
Published: 2023
Subjects: QA76 Computer software
Online Access: http://eprints.utar.edu.my/5889/1/SE_1901229_FYP_report%2DTongKahPau_%2D_KAH_PAU_TONG.pdf
http://eprints.utar.edu.my/5889/
id my-utar-eprints.5889
record_format eprints
spelling my-utar-eprints.5889 2023-10-05T11:57:54Z Speech to text with emoji Tong, Kah Pau QA76 Computer software 2023 Final Year Project / Dissertation / Thesis NonPeerReviewed application/pdf http://eprints.utar.edu.my/5889/1/SE_1901229_FYP_report%2DTongKahPau_%2D_KAH_PAU_TONG.pdf Tong, Kah Pau (2023) Speech to text with emoji. Final Year Project, UTAR. http://eprints.utar.edu.my/5889/
institution Universiti Tunku Abdul Rahman
building UTAR Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Tunku Abdul Rahman
content_source UTAR Institutional Repository
url_provider http://eprints.utar.edu.my
topic QA76 Computer software
description Speech transcription technology has significantly changed the media, entertainment, and education fields. Transcription services have greatly simplified record-keeping, research, and note-taking by removing the need to spend hours manually transcribing protracted audio or video segments. However, reading the plain text of a transcription cannot convey the speaker's emotion as well as listening to the audio does. Humans are wired to experience many basic emotions, and these fundamental emotions help us understand, connect with, and communicate with others. Emoticons can therefore convey the user's emotions in a message. With current technology, we can add emoticons to text by choosing them manually or by saying the emoticons' names using speech recognition technology. However, this can be a hassle and cause other problems. In this project, an artificial intelligence-based mobile application for emotional voice transcription was proposed to improve digital communication, increase equality for disabled persons, and boost attentiveness in online courses. The objectives of this project are to examine the feasibility of voice recognition for emotion detection, to develop an emotional voice recognition system that accurately measures various speech features to display appropriate emojis, and to create a speech-to-text solution that transcribes text with emojis at a rate comparable to the user's speech rate and emotional portrayal. A prototyping methodology was chosen as the project approach. It consists of a requirement analysis phase, followed by a repeatable five-step cycle (design, model training, prototyping, review, and refinement), and finally the development, test, and release phase.
In conclusion, the final prototype achieved a processing time of 10-15 seconds, a speech transcription accuracy of 99.5%, and an emotion identification accuracy of 80.3% through incremental upgrades and adjustments during the prototype and development phases. Although future enhancements remain, such as a customised voice profile, multilingual assistance, transcription sharing, and a change of system architecture to a client-server model, the project is considered successful because all objectives were fulfilled.
format Final Year Project / Dissertation / Thesis
author Tong, Kah Pau
title Speech to text with emoji
publishDate 2023
url http://eprints.utar.edu.my/5889/1/SE_1901229_FYP_report%2DTongKahPau_%2D_KAH_PAU_TONG.pdf
http://eprints.utar.edu.my/5889/