Reinforcement learning in non-Markovian environments using automatic discovery of subgoals

Le Tien Dung, Takashi Komeda, Motoki Takagi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

Learning time is always a critical issue in Reinforcement Learning, especially when Recurrent Neural Networks (RNNs) are used to predict Q-values. By creating useful subgoals, learning can be sped up considerably. In this paper, we propose a method to accelerate learning in non-Markovian environments using automatic discovery of subgoals. Once subgoals are created, sub-policies use RNNs to attain them. The learned RNNs are then integrated into the main RNN as experts. Finally, the agent continues to learn using its new policy. Experimental results on the E-maze problem and the virtual office problem show the potential of this approach.

Original language: English
Title of host publication: Proceedings of the SICE Annual Conference
Pages: 2601-2605
Number of pages: 5
DOIs: https://doi.org/10.1109/SICE.2007.4421430
Publication status: Published - 2007
Event: SICE (Society of Instrument and Control Engineers) Annual Conference, SICE 2007 - Takamatsu
Duration: 2007 Sep 17 - 2007 Sep 20




ASJC Scopus subject areas

  • Engineering (all)

Cite this

Dung, L. T., Komeda, T., & Takagi, M. (2007). Reinforcement learning in non-Markovian environments using automatic discovery of subgoals. In Proceedings of the SICE Annual Conference (pp. 2601-2605). [4421430] https://doi.org/10.1109/SICE.2007.4421430