Staff View: A robust speaker-aware speech separation technique using composite speech models

Speech separation techniques are commonly used to selectively filter audio sources. Early works apply acoustic profiling to discriminate between multiple audio sources, while modern techniques leverage composite audio-visual cues for more precise audio source separation. With visual i...

Bibliographic Details
Main Author:	Mak, Wen Xuan
Format:	Final Year Project / Dissertation / Thesis
Published:	2020
Subjects:	Q Science (General)
Online Access:	http://eprints.utar.edu.my/3906/1/16ACB04621_FYP.pdf http://eprints.utar.edu.my/3906/

id	my-utar-eprints.3906
record_format	eprints
spelling	my-utar-eprints.3906 2021-01-07T06:49:39Z A robust speaker-aware speech separation technique using composite speech models Mak, Wen Xuan Q Science (General) Speech separation techniques are commonly used to selectively filter audio sources. Early works apply acoustic profiling to discriminate between multiple audio sources, while modern techniques leverage composite audio-visual cues for more precise audio source separation. With visual input, speakers are first recognized by their facial features, then voice-matched so that the corresponding audio signal can be filtered. However, existing speech separation techniques do not account for off-screen speakers who are actively speaking in these videos. This project aims to design a robust speaker-aware speech separation pipeline that also handles off-screen speakers. The pipeline performs speech separation sequentially: (1) audio-visual speech separation for all visible speakers, then (2) blind source separation on the residual audio signal to recover off-screen speech. Two independent models are designed, an audio-only model and an audio-visual model, which are then merged to form a pipeline that performs comprehensive speech separation. The outcome of the project is a data-type-agnostic speech separation technique that demonstrates robust filtering performance regardless of input type. 2020-05-15 Final Year Project / Dissertation / Thesis NonPeerReviewed application/pdf http://eprints.utar.edu.my/3906/1/16ACB04621_FYP.pdf Mak, Wen Xuan (2020) A robust speaker-aware speech separation technique using composite speech models. Final Year Project, UTAR. http://eprints.utar.edu.my/3906/
institution	Universiti Tunku Abdul Rahman
building	UTAR Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Tunku Abdul Rahman
content_source	UTAR Institutional Repository
url_provider	http://eprints.utar.edu.my
topic	Q Science (General)
spellingShingle	Q Science (General) Mak, Wen Xuan A robust speaker-aware speech separation technique using composite speech models
description	Speech separation techniques are commonly used to selectively filter audio sources. Early works apply acoustic profiling to discriminate between multiple audio sources, while modern techniques leverage composite audio-visual cues for more precise audio source separation. With visual input, speakers are first recognized by their facial features, then voice-matched so that the corresponding audio signal can be filtered. However, existing speech separation techniques do not account for off-screen speakers who are actively speaking in these videos. This project aims to design a robust speaker-aware speech separation pipeline that also handles off-screen speakers. The pipeline performs speech separation sequentially: (1) audio-visual speech separation for all visible speakers, then (2) blind source separation on the residual audio signal to recover off-screen speech. Two independent models are designed, an audio-only model and an audio-visual model, which are then merged to form a pipeline that performs comprehensive speech separation. The outcome of the project is a data-type-agnostic speech separation technique that demonstrates robust filtering performance regardless of input type.
format	Final Year Project / Dissertation / Thesis
author	Mak, Wen Xuan
author_facet	Mak, Wen Xuan
author_sort	Mak, Wen Xuan
title	A robust speaker-aware speech separation technique using composite speech models
title_short	A robust speaker-aware speech separation technique using composite speech models
title_full	A robust speaker-aware speech separation technique using composite speech models
title_fullStr	A robust speaker-aware speech separation technique using composite speech models
title_full_unstemmed	A robust speaker-aware speech separation technique using composite speech models
title_sort	robust speaker-aware speech separation technique using composite speech models
publishDate	2020
url	http://eprints.utar.edu.my/3906/1/16ACB04621_FYP.pdf http://eprints.utar.edu.my/3906/
_version_	1688551790838022144
score	13.19449

Similar Items