The multilayer perceptron vector quantized variational autoencoder for spectral envelope quantization

Tanasan Srikotr, Kazunori Mano

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Recently, deep generative learning has been introduced as a replacement for conventional mathematical models. In speech processing, vector quantization is an effective compression method for reducing the amount of speech data before transmission. In this paper, we propose the Multilayer Perceptron Vector Quantized Variational Autoencoder (MLP-VQ-VAE), which allows flexible and efficient control over both the number of z-latent vectors to quantize and the embedding space size. The MLP-VQ-VAE replaces the Convolutional Neural Network (CNN) in the encoder and decoder networks of the Vector Quantized Variational Autoencoder (VQ-VAE) with a Multilayer Perceptron (MLP), which yields an effective size of z-latent vectors for quantization and also provides dimensionality reduction. In the experiments, the MLP-VQ-VAE is applied to quantize spectral envelope parameters from WORLD, a 48 kHz high-quality vocoder. The MLP-VQ-VAE reduces the memory size of the z-latent representation (the length of the vectors to quantize) and of the embedding space (the codebook) by a factor of around 1.6 compared with conventional vector quantization and 21.4 compared with the VQ-VAE. The proposed method achieves a Log Spectral Distortion around 1.1 dB lower than conventional VQ and around 2.5 dB lower than the VQ-VAE.

Original language: English
Title of host publication: 2020 IEEE International Conference on Consumer Electronics, ICCE 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728151861
Publication status: Published - 2020 Jan
Event: 2020 IEEE International Conference on Consumer Electronics, ICCE 2020 - Las Vegas, United States
Duration: 2020 Jan 4 → 2020 Jan 6

Publication series

Name: Digest of Technical Papers - IEEE International Conference on Consumer Electronics
Volume: 2020-January
ISSN (Print): 0747-668X

Conference

Conference: 2020 IEEE International Conference on Consumer Electronics, ICCE 2020
Country/Territory: United States
City: Las Vegas
Period: 20/1/4 → 20/1/6

ASJC Scopus subject areas

  • Industrial and Manufacturing Engineering
  • Electrical and Electronic Engineering
