Sub-band Vector Quantized Variational AutoEncoder for Spectral Envelope Quantization

Tanasan Srikotr, Kazunori Mano

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Recently, many deep learning models have succeeded in replacing conventional methods in speech processing. Vector quantization is a popular technique for reducing the amount of speech data before transmission. Conventional vector quantization methods are based on mathematical models. In the last few years, the Vector Quantized Variational AutoEncoder has been proposed for end-to-end vector quantization based on deep learning techniques. In this paper, we investigate sub-band quantization in the Vector Quantized Variational AutoEncoder. This model can concentrate on specific frequency bands, assigning them more bits and leaving less important bands with few bits. Experimental results show the efficiency of the proposed quantization method for the spectral envelope parameters of WORLD, a high-quality vocoder that operates at a 48 kHz sampling frequency. At the same four target bit rates, the sub-band Vector Quantized Variational AutoEncoder reduces the Log Spectral Distortion by around 0.93 dB on average.
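To make the two central ideas of the abstract concrete, the following is a minimal sketch of sub-band vector quantization with an unequal bit allocation, together with one common form of the Log Spectral Distortion (mean over frames of the RMS difference in dB across frequency bins). It is not the authors' model: a VQ-VAE learns its encoder, decoder, and codebooks jointly, whereas this sketch uses plain nearest-neighbour search against codebooks sampled from the data. The function names, band edges, bit allocation, and synthetic input are all assumptions chosen for illustration, and the LSD definition used in the paper may differ in detail.

```python
import numpy as np


def log_spectral_distortion(ref_db, est_db):
    """Mean over frames of the RMS difference (in dB) across frequency bins.

    Both inputs are arrays of shape (frames, bins) already expressed in dB.
    """
    diff = np.asarray(ref_db, dtype=float) - np.asarray(est_db, dtype=float)
    return float(np.mean(np.sqrt(np.mean(diff ** 2, axis=-1))))


def quantize_subbands(frames, band_edges, codebooks):
    """Nearest-neighbour VQ applied independently to each frequency sub-band.

    frames     : array of shape (frames, bins)
    band_edges : list of (lo, hi) bin ranges, one per sub-band
    codebooks  : list of arrays of shape (2**bits, hi - lo), one per sub-band
    Returns (list of index arrays, reconstructed frames).
    """
    frames = np.asarray(frames, dtype=float)
    recon = np.empty_like(frames)
    indices = []
    for (lo, hi), codebook in zip(band_edges, codebooks):
        band = frames[:, lo:hi]
        # Squared Euclidean distance from every frame to every codeword.
        dist = ((band[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
        idx = dist.argmin(axis=1)
        recon[:, lo:hi] = codebook[idx]
        indices.append(idx)
    return indices, recon


# Hypothetical demo: 16-bin synthetic "envelopes" in dB, two sub-bands,
# with more bits spent on the lower band than on the upper band.
rng = np.random.default_rng(0)
frames = 10.0 * rng.normal(size=(200, 16))
band_edges = [(0, 8), (8, 16)]
bits = [6, 2]  # assumed allocation: 8 bits per frame in total

# Stand-in codebooks: randomly chosen data frames (a VQ-VAE would learn these).
codebooks = [
    frames[rng.choice(len(frames), size=2 ** b, replace=False), lo:hi]
    for (lo, hi), b in zip(band_edges, bits)
]

indices, recon = quantize_subbands(frames, band_edges, codebooks)
print("bits per frame:", sum(bits))
print("LSD (dB):", log_spectral_distortion(frames, recon))
```

Changing the entries of `bits` while keeping their sum fixed shows how the same bit budget can be redistributed between bands, which is the trade-off the abstract describes.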

Original language: English
Title of host publication: Proceedings of the TENCON 2019
Subtitle of host publication: Technology, Knowledge, and Society
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 296-300
Number of pages: 5
ISBN (Electronic): 9781728118956
DOIs: https://doi.org/10.1109/TENCON.2019.8929436
Publication status: Published - 2019 Oct
Event: 2019 IEEE Region 10 Conference: Technology, Knowledge, and Society, TENCON 2019 - Kerala, India
Duration: 2019 Oct 17 to 2019 Oct 20

Publication series

Name: IEEE Region 10 Annual International Conference, Proceedings/TENCON
Volume: 2019-October
ISSN (Print): 2159-3442
ISSN (Electronic): 2159-3450

Conference

Conference: 2019 IEEE Region 10 Conference: Technology, Knowledge, and Society, TENCON 2019
Country: India
City: Kerala
Period: 19/10/17 to 19/10/20

Fingerprint

Vector quantization
Speech processing
Frequency bands
Mathematical models
Sampling
Deep learning

Keywords

  • autoencoder
  • sub-band coding
  • vector quantization
  • vector quantized variational autoencoder

ASJC Scopus subject areas

  • Computer Science Applications
  • Electrical and Electronic Engineering

Cite this

Srikotr, T., & Mano, K. (2019). Sub-band Vector Quantized Variational AutoEncoder for Spectral Envelope Quantization. In Proceedings of the TENCON 2019: Technology, Knowledge, and Society (pp. 296-300). [8929436] (IEEE Region 10 Annual International Conference, Proceedings/TENCON; Vol. 2019-October). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/TENCON.2019.8929436
