The multilayer perceptron vector quantized variational autoencoder for spectral envelope quantization

Tanasan Srikotr, Kazunori Mano

Research output: Conference contribution

Abstract

Recently, deep generative learning has been introduced as a replacement for conventional mathematical models. In speech processing, vector quantization is an effective compression method for reducing the amount of speech data before transmission. In this paper, we propose the Multilayer Perceptron Vector Quantized Variational Autoencoder (MLP-VQ-VAE), which gives flexible and efficient control over the number of z-latent vectors to quantize and over the embedding space size. The MLP-VQ-VAE replaces the Convolutional Neural Network (CNN) in the encoder and decoder networks of the Vector Quantized Variational Autoencoder (VQ-VAE) with a Multilayer Perceptron (MLP), which yields an effective z-latent vector size for quantization and also provides dimensionality reduction. In the experiments, the MLP-VQ-VAE is applied to quantize spectral envelope parameters from the 48 kHz high-quality vocoder WORLD. The MLP-VQ-VAE reduces the memory required for the z-latent representation (the length of the vectors to quantize) and for the embedding space (the codebook) by a factor of around 1.6 compared with conventional vector quantization (VQ) and around 21.4 compared with the VQ-VAE. The proposed method also lowers the Log Spectral Distortion by around 1.1 dB relative to conventional VQ and by around 2.5 dB relative to the VQ-VAE.
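
The two operations the abstract relies on, codebook lookup and the Log Spectral Distortion metric, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names `quantize` and `log_spectral_distortion` are hypothetical, the lookup uses plain squared Euclidean distance, and the distortion is computed with the commonly used RMS-of-dB-difference definition, which may differ from the exact formula in the paper. The numbers are toy values, not data from the experiments.

```python
import math


def quantize(z, codebook):
    """Return (index, codeword) of the codebook entry nearest to z.

    Nearness is squared Euclidean distance (an assumption for this sketch).
    """
    best = min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(z, codebook[k])),
    )
    return best, codebook[best]


def log_spectral_distortion(ref, est, eps=1e-12):
    """RMS difference in dB between two power spectral envelopes of one frame.

    eps guards against log of zero; it is not part of the metric itself.
    """
    diffs = [10.0 * math.log10((r + eps) / (e + eps)) for r, e in zip(ref, est)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Toy values, for illustration only (not data from the paper).
codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.5]]
index, codeword = quantize([0.9, 1.2], codebook)

# Distortion between a reference envelope and one that is 10x larger in power.
lsd = log_spectral_distortion([1.0, 1.0], [10.0, 10.0])
```

In a VQ-VAE-style model the vector passed to `quantize` would be an encoder output and only the index would need to be transmitted. Here the input is a hand-written list so that the lookup can be followed by eye.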

Original language: English
Title of host publication: 2020 IEEE International Conference on Consumer Electronics, ICCE 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (electronic): 9781728151861
DOI: https://doi.org/10.1109/ICCE46568.2020.9043006
Publication status: Published - Jan 2020
Event: 2020 IEEE International Conference on Consumer Electronics, ICCE 2020 - Las Vegas, United States
Duration: 4 Jan 2020 to 6 Jan 2020

Publication series

Name: Digest of Technical Papers - IEEE International Conference on Consumer Electronics
Volume: 2020-January
ISSN (print): 0747-668X

Conference

Conference: 2020 IEEE International Conference on Consumer Electronics, ICCE 2020
Country: United States
City: Las Vegas
Period: 4 Jan 2020 to 6 Jan 2020

ASJC Scopus subject areas

  • Industrial and Manufacturing Engineering
  • Electrical and Electronic Engineering

  • Cite this

    Srikotr, T., & Mano, K. (2020). The multilayer perceptron vector quantized variational autoencoder for spectral envelope quantization. In 2020 IEEE International Conference on Consumer Electronics, ICCE 2020 [9043006] (Digest of Technical Papers - IEEE International Conference on Consumer Electronics; Vol. 2020-January). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICCE46568.2020.9043006