The multilayer perceptron vector quantized variational autoencoder for spectral envelope quantization

Tanasan Srikotr, Kazunori Mano

Research output: Conference contribution

Abstract

Deep generative learning has recently been introduced as a replacement for conventional mathematical models. In speech processing, vector quantization (VQ) is an effective compression method for reducing the amount of speech data before transmission. In this paper, we propose the Multilayer Perceptron Vector Quantized Variational Autoencoder (MLP-VQ-VAE), which allows flexible and efficient control over both the number of z-latent vectors to quantize and the size of the embedding space. The MLP-VQ-VAE replaces the Convolutional Neural Network (CNN) in the encoder and decoder networks of the Vector Quantized Variational Autoencoder (VQ-VAE) with a Multilayer Perceptron (MLP), which yields an efficient z-latent vector size for quantization and also provides dimensionality reduction. In the experiments, the MLP-VQ-VAE is applied to quantize spectral envelope parameters from WORLD, a 48 kHz high-quality vocoder. The MLP-VQ-VAE reduces the memory required for the z-latent representation (the length of the vectors to quantize) and the embedding space (the codebook) by a factor of about 1.6 compared with conventional vector quantization and by a factor of about 21.4 compared with the VQ-VAE. The proposed method lowers the Log Spectral Distortion by about 1.1 dB relative to conventional VQ and by about 2.5 dB relative to the VQ-VAE.
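
As a rough illustration of two ideas named in the abstract, the sketch below shows a nearest-codeword lookup (the quantization step shared by conventional VQ and VQ-VAE-style models) and one common definition of log spectral distortion. It is not the authors' implementation: the function names, the toy codebook, and the toy spectra are all hypothetical, and the abstract does not state which exact LSD formula the paper uses.

```python
import math


def quantize(z, codebook):
    """Return (index, codeword) of the codebook entry nearest to z.

    Distance is squared Euclidean; `codebook` is a list of equal-length vectors.
    """
    best_index = min(
        range(len(codebook)),
        key=lambda k: sum((zi - ci) ** 2 for zi, ci in zip(z, codebook[k])),
    )
    return best_index, codebook[best_index]


def log_spectral_distortion(ref_power, est_power):
    """RMS difference in dB between two power spectra of one frame.

    Assumed definition: sqrt(mean((10 * log10(ref / est)) ** 2)).
    """
    diffs = [10.0 * math.log10(r / e) for r, e in zip(ref_power, est_power)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical toy values, chosen only to exercise the functions.
codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.5]]
z = [0.9, 1.2]                      # stands in for one encoder output vector
index, z_q = quantize(z, codebook)  # only `index` would need to be transmitted

lsd = log_spectral_distortion([1.0, 10.0, 100.0], [1.0, 10.0, 10.0])
```

In this picture, transmitting `index` instead of `z` is what provides the compression, and the memory cost is governed by the vector length and the number of codebook entries, the two quantities the abstract reports as reduced.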

Original language: English
Title of host publication: 2020 IEEE International Conference on Consumer Electronics, ICCE 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728151861
DOI
Publication status: Published - Jan 2020
Event: 2020 IEEE International Conference on Consumer Electronics, ICCE 2020 - Las Vegas, United States
Duration: 4 Jan 2020 to 6 Jan 2020

Publication series

Name: Digest of Technical Papers - IEEE International Conference on Consumer Electronics
2020-January
ISSN (Print): 0747-668X

Conference

Conference: 2020 IEEE International Conference on Consumer Electronics, ICCE 2020
Country/Territory: United States
City: Las Vegas
Period: 4 Jan 2020 to 6 Jan 2020

ASJC Scopus subject areas

  • Industrial and Manufacturing Engineering
  • Electrical and Electronic Engineering
