A Pipeline To Data Preprocessing For Lipreading And Audio-Visual Speech Recognition
Studies show that only about 30 to 45 percent of the English language can be understood through lipreading alone. Even the most talented lip readers cannot capture a complete message from lipreading by itself, although they are often very good at interpreting facial expressions, body language, and context...
| Main Authors: | Ngo, Hea Choon; Hashim, Ummi Rabaah; Raja Ikram, Raja Rina; Salahuddin, Lizawati; Teoh, Mok Lee |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | World Academy of Research in Science and Engineering, 2020 |
| Online Access: | http://eprints.utem.edu.my/id/eprint/25010/2/2020%2C%20NGO%2C%20AUDIO-VISUAL%20SPEECH%20-%20IJATCSE_01.PDF http://eprints.utem.edu.my/id/eprint/25010/ http://www.warse.org/IJATCSE/static/pdf/file/ijatcse58942020.pdf |
| id | my.utem.eprints.25010 |
|---|---|
| record_format | eprints |
| spelling | my.utem.eprints.25010 2021-04-20T12:28:13Z http://eprints.utem.edu.my/id/eprint/25010/ A Pipeline To Data Preprocessing For Lipreading And Audio-Visual Speech Recognition. Ngo, Hea Choon; Hashim, Ummi Rabaah; Raja Ikram, Raja Rina; Salahuddin, Lizawati; Teoh, Mok Lee. Studies show that only about 30 to 45 percent of the English language can be understood through lipreading alone. Even the most talented lip readers cannot capture a complete message from lipreading by itself, although they are often very good at interpreting facial expressions, body language, and context to fill in the gaps. As one can imagine, this technique taxes the brain in different ways and becomes exhausting over time. When a deaf person who uses spoken language and can read lips holds a simple one-on-one conversation, hearing people may not appreciate the challenges involved. The hearing person may be annoyed at being asked to repeat themselves or to speak more slowly and clearly, and may lose patience and break off the conversation. In our modern world, where technology connects us in ways never thought possible, there are many ways to communicate with another person, and deaf people come from all walks of life and backgrounds. In this study, a lipreading model is developed that records, analyzes, and translates the movement of the lips and displays the result as subtitles. A model is trained on the GRID corpus, the MIRACL-VC1 dataset, and a pre-trained dataset, using the LipNet architecture, to build a system with which deaf people can decode text from the movement of a speaker's mouth. In conclusion, this system helps deaf people understand what others are actually saying and communicate with them more effectively. World Academy of Research in Science and Engineering, 2020-08. Article, PeerReviewed, text, en. http://eprints.utem.edu.my/id/eprint/25010/2/2020%2C%20NGO%2C%20AUDIO-VISUAL%20SPEECH%20-%20IJATCSE_01.PDF Ngo, Hea Choon and Hashim, Ummi Rabaah and Raja Ikram, Raja Rina and Salahuddin, Lizawati and Teoh, Mok Lee (2020) A Pipeline To Data Preprocessing For Lipreading And Audio-Visual Speech Recognition. International Journal of Advanced Trends in Computer Science and Engineering, 9 (4). pp. 4589-4596. ISSN 2278-3091. http://www.warse.org/IJATCSE/static/pdf/file/ijatcse58942020.pdf DOI: 10.30534/ijatcse/2020/58942020 |
| institution | Universiti Teknikal Malaysia Melaka |
| building | UTEM Library |
| collection | Institutional Repository |
| continent | Asia |
| country | Malaysia |
| content_provider | Universiti Teknikal Malaysia Melaka |
| content_source | UTEM Institutional Repository |
| url_provider | http://eprints.utem.edu.my/ |
| language | English |
| description | Studies show that only about 30 to 45 percent of the English language can be understood through lipreading alone. Even the most talented lip readers cannot capture a complete message from lipreading by itself, although they are often very good at interpreting facial expressions, body language, and context to fill in the gaps. As one can imagine, this technique taxes the brain in different ways and becomes exhausting over time. When a deaf person who uses spoken language and can read lips holds a simple one-on-one conversation, hearing people may not appreciate the challenges involved. The hearing person may be annoyed at being asked to repeat themselves or to speak more slowly and clearly, and may lose patience and break off the conversation. In our modern world, where technology connects us in ways never thought possible, there are many ways to communicate with another person, and deaf people come from all walks of life and backgrounds. In this study, a lipreading model is developed that records, analyzes, and translates the movement of the lips and displays the result as subtitles. A model is trained on the GRID corpus, the MIRACL-VC1 dataset, and a pre-trained dataset, using the LipNet architecture, to build a system with which deaf people can decode text from the movement of a speaker's mouth. In conclusion, this system helps deaf people understand what others are actually saying and communicate with them more effectively. |
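The preprocessing pipeline the abstract describes isolates the speaker's mouth region from each video frame before the frames are fed to a lipreading model such as LipNet. As a minimal, illustrative sketch (the function names, padding value, and frame representation are assumptions for illustration, not taken from the paper), cropping a fixed mouth region of interest (ROI) around detected lip landmarks might look like this:

```python
# Illustrative sketch of a mouth-ROI cropping step for lipreading
# preprocessing. Landmark points and padding are hypothetical; in a real
# pipeline they would come from a face-landmark detector.

def mouth_bounding_box(landmarks, pad=10):
    """Given (x, y) lip-landmark points, return a padded bounding box
    as (x_min, y_min, x_max, y_max)."""
    xs = [p[0] for p in landmarks]
    ys = [p[1] for p in landmarks]
    return (min(xs) - pad, min(ys) - pad, max(xs) + pad, max(ys) + pad)

def crop_roi(frame, box):
    """Crop a 2-D frame (a list of pixel rows) to the bounding box,
    clamped so the box never runs outside the frame."""
    x0, y0, x1, y1 = box
    h, w = len(frame), len(frame[0])
    x0, y0 = max(0, x0), max(0, y0)
    x1, y1 = min(w, x1), min(h, y1)
    return [row[x0:x1] for row in frame[y0:y1]]

# Example: crop a 20x20 frame around two lip landmarks.
frame = [[col for col in range(20)] for _ in range(20)]
box = mouth_bounding_box([(5, 8), (12, 10)], pad=2)   # (3, 6, 14, 12)
roi = crop_roi(frame, box)                            # 6 rows x 11 columns
```

In practice, the landmarks would come from a face-landmark detector (for example, dlib's 68-point model), and the cropped region would then be resized to the fixed input resolution the lipreading network expects.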
| format | Article |
| author | Ngo, Hea Choon; Hashim, Ummi Rabaah; Raja Ikram, Raja Rina; Salahuddin, Lizawati; Teoh, Mok Lee |
| spellingShingle | Ngo, Hea Choon; Hashim, Ummi Rabaah; Raja Ikram, Raja Rina; Salahuddin, Lizawati; Teoh, Mok Lee. A Pipeline To Data Preprocessing For Lipreading And Audio-Visual Speech Recognition |
| author_facet | Ngo, Hea Choon; Hashim, Ummi Rabaah; Raja Ikram, Raja Rina; Salahuddin, Lizawati; Teoh, Mok Lee |
| author_sort | Ngo, Hea Choon |
| title | A Pipeline To Data Preprocessing For Lipreading And Audio-Visual Speech Recognition |
| title_short | A Pipeline To Data Preprocessing For Lipreading And Audio-Visual Speech Recognition |
| title_full | A Pipeline To Data Preprocessing For Lipreading And Audio-Visual Speech Recognition |
| title_fullStr | A Pipeline To Data Preprocessing For Lipreading And Audio-Visual Speech Recognition |
| title_full_unstemmed | A Pipeline To Data Preprocessing For Lipreading And Audio-Visual Speech Recognition |
| title_sort | pipeline to data preprocessing for lipreading and audio-visual speech recognition |
| publisher | World Academy of Research in Science and Engineering |
| publishDate | 2020 |
| url | http://eprints.utem.edu.my/id/eprint/25010/2/2020%2C%20NGO%2C%20AUDIO-VISUAL%20SPEECH%20-%20IJATCSE_01.PDF http://eprints.utem.edu.my/id/eprint/25010/ http://www.warse.org/IJATCSE/static/pdf/file/ijatcse58942020.pdf |