An Iterated Two-Step Sinusoidal Pitch Contour Formulation for Expressive Speech Synthesis

Bibliographic Details
Main Authors: Ramli, Izzad; Jamil, Nursuriati; Seman, Noraini
Format: Article
Language: English
Published: Universiti Utara Malaysia Press 2021
Subjects: QA75 Electronic computers. Computer science
Online Access: https://repo.uum.edu.my/id/eprint/28759/1/JICT%2020%2004%202021%20489-510.pdf
https://repo.uum.edu.my/id/eprint/28759/
https://doi.org/10.32890/jict2021.20.4.2
id my.uum.repo.28759
record_format eprints
spelling my.uum.repo.28759 2022-07-27T02:04:24Z Article PeerReviewed application/pdf en Ramli, Izzad and Jamil, Nursuriati and Seman, Noraini (2021) An Iterated Two-Step Sinusoidal Pitch Contour Formulation for Expressive Speech Synthesis. Journal of Information and Communication Technology, 20 (04). pp. 489-510. ISSN 2180-3862 https://doi.org/10.32890/jict2021.20.4.2
institution Universiti Utara Malaysia
building UUM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Utara Malaysia
content_source UUM Institutional Repository
url_provider http://repo.uum.edu.my/
language English
topic QA75 Electronic computers. Computer science
description Intonation generation in expressive speech, such as storytelling, is essential for producing a high-quality Malay-language expressive speech synthesizer. One intonation generation approach, explicit control, has shown good performance in terms of intelligibility with reasonably natural speech and was therefore selected for this research. This approach modifies prosodic features such as pitch contour, intensity, and duration to generate the intonation. However, pitch contour modification remains a problem because the desired pitch contour is not achieved. This paper formulates an improved pitch contour algorithm that produces a modified pitch contour resembling the natural pitch contour. The syllable pitch contours of nine storytellers were extracted from their storytelling speech to create an expressive speech syllable dataset called STORY_DATA. The pitch contour shapes in STORY_DATA were analyzed and clustered into the six standard main pitch contour clusters for storytelling, using one minus the Pearson product-moment correlation as the distance measure. An improved iterative two-step sinusoidal pitch contour formulation was then introduced to transform the pitch contours of neutral speech into the expressive pitch contours of natural speech. Overall, the improved formulation achieved 93 percent highly correlated matches, compared with 15 percent for the previous pitch contour formulation, indicating a closer resemblance to the natural contours. The improved formula can therefore be used in a text-to-speech (TTS) synthesizer to produce more natural expressive speech. The paper also identified unique expressive pitch contours in the Malay language that need further investigation.
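
The clustering measure named in the description, one minus the Pearson product-moment correlation, can be illustrated with a short sketch. This is not the authors' implementation: the function names `pearson_distance` and `nearest_shape`, the template labels, and all pitch values (in Hz) are hypothetical, and the nearest-template assignment is a simplification standing in for whatever clustering procedure the paper actually uses.

```python
import math


def pearson_distance(x, y):
    """Return 1 - r, where r is the Pearson correlation of two contours."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("contours must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    norm = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if norm == 0:
        raise ValueError("a flat contour has no defined correlation")
    return 1.0 - cov / norm


def nearest_shape(contour, templates):
    """Pick the template name with the smallest Pearson distance (assumed helper)."""
    return min(templates, key=lambda name: pearson_distance(contour, templates[name]))


# Hypothetical syllable pitch contours sampled at five points (Hz).
rising = [100.0, 110.0, 125.0, 140.0, 160.0]
falling = [160.0, 140.0, 125.0, 110.0, 100.0]
sample = [180.0, 188.0, 199.0, 214.0, 230.0]

templates = {"rising": rising, "falling": falling}
label = nearest_shape(sample, templates)
print(label)
```

Because the measure depends only on correlation, it compares contour shape and ignores overall pitch level and range, which is why the higher-pitched `sample` is still matched to the `rising` template here.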