Articulatory Features for Expressive Speech Synthesis

Year of Publication: 2012
Authors: Black, A. W., H. Bunnell T., Dou Y., Kumar P., Metze F., Perry D., Polzehl T., Prahallad K., Steidl S., & Vaughn C.
This paper describes some of the results from the project entitled “New Parameterization for Emotional Speech Synthesis” held at the Summer 2011 JHU CLSP workshop. We describe experiments on how to use articulatory features as a meaningful intermediate representation for speech synthesis. This parameterization not only allows us to reproduce natural sounding speech but also allows us to generate stylistically varying speech. We show methods for deriving articulatory features from speech, predicting articulatory features from text and reconstructing natural sounding speech from the predicted articulatory features. The methods were tested on clean speech databases in English and German, as well as databases of emotionally and personality varying speech. The resulting speech was evaluated both objectively, using techniques normally used for emotion identification, and subjectively, using crowd-sourcing.

Index Terms: speech synthesis, articulatory features, emotional speech, meta-data extraction, evaluation

Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2012), Kyoto, Japan

