A calculation cost reduction method for a log-likelihood maximization in word2vec

Sakuya Nakamura, Masaomi Kimura

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Word2vec models learn from text data and assign distributed representations to words. These representations are vectors that capture the meaning of the words, which makes word2vec models useful for Natural Language Processing (NLP). However, it is difficult to update a model when new data is added, because generating a word2vec model takes a long time. This training time has become an impediment to analyzing text data that contains many unknown words, and it is caused by the cost of computing the likelihood function. The purpose of this study was to speed up the training of the Continuous Bag-of-Words (CBOW) model, which is one of the word2vec models, by reducing the calculation cost of the likelihood function. The likelihood function in CBOW is expressed with a softmax function, which is computationally expensive. In this paper, a sigmoid function replaces the softmax function as an approximate likelihood function, because the sigmoid function can reproduce the characteristic change of the likelihood function in CBOW.
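To illustrate the cost difference the abstract refers to, the following is a minimal sketch, not the authors' formulation (the abstract does not give their exact approximation). It contrasts an exact softmax log-likelihood for CBOW, which needs a score for every vocabulary word, with a hypothetical sigmoid-based surrogate that scores only the target word. The names `input_vectors`, `output_vectors`, `softmax_log_likelihood`, and `sigmoid_log_likelihood`, and the sizes `V` and `d`, are assumptions made for this example.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax_log_likelihood(h, output_vectors, target):
    """Exact CBOW log-likelihood: scores every vocabulary word, O(V * d)."""
    scores = [dot(h, w) for w in output_vectors]
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return scores[target] - log_z

def sigmoid_log_likelihood(h, output_vectors, target):
    """Illustrative surrogate (assumption): log sigmoid of the target score, O(d)."""
    s = dot(h, output_vectors[target])
    # log(sigmoid(s)) computed in a numerically stable way
    if s >= 0:
        return -math.log1p(math.exp(-s))
    return s - math.log1p(math.exp(s))

random.seed(0)
V, d = 200, 8  # toy vocabulary size and embedding dimension
input_vectors = [[random.gauss(0, 0.1) for _ in range(d)] for _ in range(V)]
output_vectors = [[random.gauss(0, 0.1) for _ in range(d)] for _ in range(V)]

# CBOW hidden vector: the average of the context words' input vectors
context = [3, 17, 42, 99]
h = [sum(input_vectors[c][k] for c in context) / len(context) for k in range(d)]

print(softmax_log_likelihood(h, output_vectors, target=5))
print(sigmoid_log_likelihood(h, output_vectors, target=5))
```

The softmax version touches all `V` output vectors for each training example, while the surrogate touches one, which is where a saving of the kind described would come from. How the paper fits the sigmoid to the behavior of the true likelihood is not specified in the abstract and is not reproduced here.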

Original language: English
Title of host publication: ICAC 2019 - 2019 25th IEEE International Conference on Automation and Computing
Editors: Hui Yu
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781861376664
Publication status: Published - 2019 Sep
Event: 25th IEEE International Conference on Automation and Computing, ICAC 2019 - Lancaster, United Kingdom
Duration: 2019 Sep 5 → 2019 Sep 7

Publication series

Name: ICAC 2019 - 2019 25th IEEE International Conference on Automation and Computing

Conference

Conference: 25th IEEE International Conference on Automation and Computing, ICAC 2019
Country: United Kingdom
City: Lancaster
Period: 19/9/5 → 19/9/7

Keywords

  • CBOW
  • Component
  • Computational time
  • Softmax
  • Training acceleration
  • Word2Vec

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Science Applications
  • Control and Optimization
