A clustering experiment of the spectra and the spectral changes of speech to extract phonemic features

Katsuhiko Shirai, Kazunori Mano

Research output: Article (peer-reviewed)

2 Citations (Scopus)


As a step towards phoneme identification, a method of clustering speech spectra and spectral changes is discussed. In this technique, two kinds of acoustic features are defined for each analysis frame. The first, called the Level 1 feature, describes the spectral contour of a frame and is represented by LPC cepstral coefficients. The second, called the Level 2 feature, describes the spectral change within a frame and is defined as the difference between the LPC cepstral coefficients derived from the first half and the second half of the frame. The phonemic feature of each frame is defined as a triplet of phonemic names. The acoustic features of Levels 1 and 2 are calculated from 800 V, VV, CV, VCV (vowel, vowel-vowel, consonant-vowel, vowel-consonant-vowel) syllables uttered by one male speaker and clustered with a vector quantizer (VQ) design algorithm. This VQ design method is based on that of Linde, Buzo and Gray (1980), slightly modified to take into account the labels of the frames belonging to each cluster. As a result, each frame is characterized by its cluster numbers, or centroid numbers, at Level 1 and Level 2. The relation between the cluster numbers and the phonemic feature was investigated. It was found that the number of different phonemic labels corresponding to each cluster was fewer than five. Of the 5503 resulting clusters, that is, the combinations of Level 1 and Level 2 codes (centroid numbers) that actually occurred, 4428 had only one kind of label.
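The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the LPC order, the Euclidean distortion measure, the additive splitting step and the sign convention of the Level 2 difference (second half minus first half) are all assumptions, and the codebook design shown is plain Linde-Buzo-Gray splitting without the label-aware modification the abstract mentions, whose details are not given here.

```python
from collections import defaultdict


def lpc_cepstrum(samples, order=10, n_cep=10):
    """LPC cepstrum of one frame (autocorrelation method + Levinson-Durbin)."""
    n = len(samples)
    r = [sum(samples[i] * samples[i + k] for i in range(n - k))
         for k in range(order + 1)]
    if r[0] == 0:                      # silent frame: nothing to model
        return [0.0] * n_cep
    a = [0.0] * (order + 1)            # predictor: x[t] ~ sum_k a[k] * x[t-k]
    err = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
        if err <= 0:                   # frame is perfectly predictable; stop early
            break
    # Standard LPC-to-cepstrum recursion for an all-pole model.
    c = [0.0] * (n_cep + 1)
    for m in range(1, n_cep + 1):
        direct = a[m] if m <= order else 0.0
        c[m] = direct + sum((k / m) * c[k] * a[m - k]
                            for k in range(1, m) if m - k <= order)
    return c[1:]


def frame_features(frame, order=10, n_cep=10):
    """Level 1: cepstrum of the whole frame. Level 2: change between halves."""
    half = len(frame) // 2
    level1 = lpc_cepstrum(frame, order, n_cep)
    first = lpc_cepstrum(frame[:half], order, n_cep)
    second = lpc_cepstrum(frame[half:], order, n_cep)
    level2 = [s - f for f, s in zip(first, second)]  # assumed sign convention
    return level1, level2


def dist2(u, v):
    """Squared Euclidean distance (assumed distortion measure)."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def nearest(vec, codebook):
    """Index of the closest centroid, i.e. the cluster number of `vec`."""
    return min(range(len(codebook)), key=lambda i: dist2(vec, codebook[i]))


def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def lbg(vectors, size, eps=0.01, iters=20):
    """Plain LBG design by splitting; `size` is assumed to be a power of two."""
    codebook = [mean(vectors)]
    while len(codebook) < size:
        # Split every centroid into two slightly perturbed copies.
        codebook = [[x + s for x in c] for c in codebook for s in (eps, -eps)]
        for _ in range(iters):         # Lloyd refinement of the split codebook
            cells = defaultdict(list)
            for v in vectors:
                cells[nearest(v, codebook)].append(v)
            codebook = [mean(cells[i]) if cells[i] else c
                        for i, c in enumerate(codebook)]
    return codebook


def labels_per_code_pair(codes1, codes2, labels):
    """Map each occurring (Level 1 code, Level 2 code) pair to its label set."""
    table = defaultdict(set)
    for c1, c2, lab in zip(codes1, codes2, labels):
        table[(c1, c2)].add(lab)
    return table
```

With one codebook per level, each frame's code pair is `(nearest(level1, cb1), nearest(level2, cb2))`. In terms of this sketch, the counts reported in the abstract correspond to `len(table)` (the code pairs that actually occur) and to the number of entries whose label set has exactly one element.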

Journal: Signal Processing
Publication status: Published - Apr 1986

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering

