End-to-end Conversion Speed Analysis of an FPT.AI-based Text-to-Speech Application

Bibliographic Details
Main Authors: Chung, T.D., Drieberg, M., Bin Hassan, M.F., Khalyasmaa, A.
Format: Conference or Workshop Item
Published: Institute of Electrical and Electronics Engineers Inc. 2020
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85085167676&doi=10.1109%2fLifeTech48969.2020.1570620448&partnerID=40&md5=69ffc5e130e473e9d8519397ab61a341
http://eprints.utp.edu.my/24638/
Description
Summary: This paper presents an FPT.AI-based text-to-speech (TTS) application that converts Vietnamese text into speech. The application is built with Django for Python and takes the form of an interactive web page connected to an FPT.AI server through its application programming interface (API). It supports conversion of text to speech in seven different Vietnamese voices. Four of the seven voices can convert up to 500 characters in a single transaction, while the other three support up to 400 characters. Based on the results obtained, the first conversion of a 400-character text takes up to 10 s, while subsequent conversions of the same text take under 1.8 s. This holds for all voices. © 2020 IEEE.
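
The measurement described in the summary (a per-voice character limit, then a first conversion timed against repeated conversions of the same text) can be sketched in Python as follows. This is only an illustrative sketch, not the authors' code: the voice names, `check_length`, `time_conversions`, and `make_fake_converter` are all hypothetical, and the stand-in converter merely sleeps instead of calling the FPT.AI API. The record does not say why the first conversion is slower, so the stand-in simply assumes that a repeated request for the same text and voice returns faster.

```python
import time
from typing import Callable, Dict, List

# Character limits taken from the summary: four voices accept up to 500
# characters, the other three up to 400. The voice names are placeholders.
VOICE_LIMITS: Dict[str, int] = {
    "voice_1": 500, "voice_2": 500, "voice_3": 500, "voice_4": 500,
    "voice_5": 400, "voice_6": 400, "voice_7": 400,
}


def check_length(text: str, voice: str) -> None:
    """Raise ValueError if the text exceeds the limit for the given voice."""
    limit = VOICE_LIMITS[voice]
    if len(text) > limit:
        raise ValueError(
            f"{voice} accepts at most {limit} characters, got {len(text)}"
        )


def time_conversions(
    convert: Callable[[str, str], bytes],
    text: str,
    voice: str,
    repeats: int = 3,
) -> List[float]:
    """Convert the same text `repeats` times and return each elapsed time in seconds."""
    check_length(text, voice)
    timings: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        convert(text, voice)
        timings.append(time.perf_counter() - start)
    return timings


def make_fake_converter(
    slow: float = 0.05, fast: float = 0.005
) -> Callable[[str, str], bytes]:
    """Hypothetical stand-in for a TTS call: slow on the first request for a
    (text, voice) pair, faster on repeats. It performs no network access."""
    seen = set()

    def convert(text: str, voice: str) -> bytes:
        key = (text, voice)
        if key in seen:
            time.sleep(fast)
        else:
            time.sleep(slow)
            seen.add(key)
        return b""  # placeholder for audio data

    return convert


if __name__ == "__main__":
    timings = time_conversions(make_fake_converter(), "a" * 400, "voice_5")
    for i, t in enumerate(timings, start=1):
        print(f"conversion {i}: {t:.3f} s")
```

To time a real service, the stand-in would be replaced by a function that sends the text to the TTS endpoint and waits until the audio is available, so that the end-to-end time is what gets recorded.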