Recording script design for corpus-based TTS system based on coverage of various phonetic elements

Mitsuaki Isogai, Hideyuki Mizuno, Kazunori Mano

Research output: Conference contribution

13 citations (Scopus)

Abstract

This paper describes a new recording script generation method for creating speech databases for corpus-based TTS systems. The method is efficient because of two features: (1) it uses a 2-stage algorithm that generates the recording script while balancing triphone, syllable, and morpheme elements; (2) it controls which types of phonetic elements are included in the recording script through weight coefficients assigned to those elements. An evaluation shows that the 2-stage algorithm is effective in raising the coverage of phonetic elements and that the method yields a recording script containing a variety of phonetic elements. A preference test shows that changing the selection criteria influences the quality of the synthesized speech. The same test also shows that, when generating a task-specific recording script, it is better to take morpheme-based elements into account than syllable-based elements.
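The abstract does not spell out the 2-stage algorithm, so the following is only a minimal sketch of the general idea of weight-controlled coverage selection, not the authors' method. It assumes a plain greedy pass in which each candidate sentence has already been analyzed into a set of `(kind, value)` elements, and in which a per-kind weight decides how much a newly covered element is worth. The function names, the element representation, and the sample weights are all hypothetical.

```python
def coverage_gain(elements, covered, weights):
    """Weighted count of the elements in a sentence that are not yet covered."""
    return sum(weights.get(kind, 0.0) for kind, _value in elements - covered)


def select_script(candidates, weights, max_sentences):
    """Greedily pick sentences that add the most weighted new coverage.

    candidates: dict mapping sentence text -> set of (kind, value) tuples,
                e.g. ("triphone", "a-b+c"). This representation is an assumption.
    weights:    dict mapping element kind -> weight coefficient (assumed form).
    Returns the chosen sentences in selection order and the covered elements.
    """
    remaining = dict(candidates)
    covered = set()
    script = []
    while remaining and len(script) < max_sentences:
        # Pick the sentence with the largest weighted gain; ties go to the
        # earliest candidate because dicts preserve insertion order.
        best = max(
            remaining,
            key=lambda text: coverage_gain(remaining[text], covered, weights),
        )
        if coverage_gain(remaining[best], covered, weights) <= 0:
            break  # nothing left adds weighted coverage
        script.append(best)
        covered |= remaining.pop(best)
    return script, covered


# Hypothetical toy data: three candidate sentences and their elements.
candidates = {
    "A": {("triphone", "a-b+c"), ("syllable", "ba")},
    "B": {("triphone", "a-b+c"), ("morpheme", "cat")},
    "C": {("syllable", "ba")},
}

morpheme_heavy = {"triphone": 1.0, "syllable": 0.5, "morpheme": 2.0}
no_morphemes = {"triphone": 1.0, "syllable": 0.5, "morpheme": 0.0}

script_m, _ = select_script(candidates, morpheme_heavy, max_sentences=3)
script_s, _ = select_script(candidates, no_morphemes, max_sentences=3)
```

In this toy setup, changing only the weights changes which sentences are chosen, which is the kind of control over element types that the abstract attributes to the weight coefficients.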

Original language: English
Title of host publication: 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05 - Proceedings - Image and Multidimensional Signal Processing Multimedia Signal Processing
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: I301-I304
ISBN (print): 0780388747, 9780780388741
DOI: https://doi.org/10.1109/ICASSP.2005.1415110
Publication status: Published - Jan 1, 2005
Event: 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05 - Philadelphia, PA, United States
Duration: Mar 18, 2005 to Mar 23, 2005

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: I
ISSN (print): 1520-6149

Conference

Conference: 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05
United States
Philadelphia, PA
Period: 2005/3/18 to 2005/3/23

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

  • Cite this

    Isogai, M., Mizuno, H., & Mano, K. (2005). Recording script design for corpus-based TTS system based on coverage of various phonetic elements. In 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05 - Proceedings - Image and Multidimensional Signal Processing Multimedia Signal Processing (pp. I301-I304). [1415110] (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. I). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP.2005.1415110