Abstract
This paper presents three corpora of emotional speech in Japanese, each designed to maximize the expression of a single emotion (joy, anger, or sadness), for use with CHATR, the concatenative speech synthesis system under development at ATR. A perceptual experiment was conducted on speech synthesized from each emotion corpus, and listeners identified the intended emotions at significantly above-chance rates. The authors' current work aims to identify the local acoustic features that characterize each emotion type. F0 and duration differed significantly among emotion types; AV (amplitude of the voicing source) and GN (glottal noise) also showed differences. This paper reports on the corpus design, the perceptual experiment, and the results of the acoustic analysis.
Original language | English
---|---
Publication status | Published - 1998
Externally published | Yes
Event | 5th International Conference on Spoken Language Processing, ICSLP 1998, Sydney, Australia; 1998 Nov 30 → 1998 Dec 4
Conference
Conference | 5th International Conference on Spoken Language Processing, ICSLP 1998
---|---
Country/Territory | Australia
City | Sydney
Period | 1998/11/30 → 1998/12/4
ASJC Scopus subject areas
- Language and Linguistics
- Linguistics and Language