Reinforcement learning in non-markovian environments using automatic discovery of subgoals

Le Tien Dung, Takashi Komeda, Motoki Takagi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

Learning time is always a critical issue in Reinforcement Learning, especially when Recurrent Neural Networks (RNNs) are used to predict Q values. By creating useful subgoals, we can speed up learning. In this paper, we propose a method to accelerate learning in non-Markovian environments using automatic discovery of subgoals. Once subgoals are created, sub-policies use RNNs to attain them. The learned RNNs are then integrated into the main RNN as experts. Finally, the agent continues to learn using its new policy. Experimental results on the E-maze problem and the virtual office problem show the potential of this approach.
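
The abstract describes the procedure only at a high level, so the sketch below is purely illustrative: it assumes a simple visit-frequency heuristic for subgoal discovery and uses placeholder classes (discover_subgoals, SubPolicy, MainPolicy) in place of the paper's RNN-based sub-policies and expert integration; none of these names or choices come from the paper itself.

from collections import Counter
import random

def discover_subgoals(successful_trajectories, top_k=2):
    # Keep the states that appear most often across successful trajectories
    # (a common bottleneck heuristic; assumed here, not taken from the paper).
    counts = Counter(state for traj in successful_trajectories for state in traj)
    return [state for state, _ in counts.most_common(top_k)]

class SubPolicy:
    # Stand-in for an RNN-based sub-policy trained to reach one subgoal.
    def __init__(self, subgoal):
        self.subgoal = subgoal
    def act(self, observation):
        # A trained RNN would map the observation history to an action;
        # a random choice is used here purely as a placeholder.
        return random.choice(["up", "down", "left", "right"])

class MainPolicy:
    # Stand-in for the main RNN; learned sub-policies are attached as experts.
    def __init__(self):
        self.experts = []
    def add_expert(self, expert):
        self.experts.append(expert)
    def act(self, observation):
        # The real agent would arbitrate between its own Q-values and the
        # experts' suggestions; this placeholder simply defers to an expert.
        if self.experts:
            return random.choice(self.experts).act(observation)
        return random.choice(["up", "down", "left", "right"])

# High-level flow: discover subgoals from experience, train one sub-policy
# per subgoal, fold the sub-policies into the main policy, keep learning.
successful_trajectories = [["s0", "s1", "s2", "goal"], ["s0", "s1", "s3", "goal"]]
subgoals = discover_subgoals(successful_trajectories)
main_policy = MainPolicy()
for g in subgoals:
    main_policy.add_expert(SubPolicy(g))  # each sub-policy would first be trained to reach g
print("Discovered subgoals:", subgoals)
print("Next action from s0:", main_policy.act(observation="s0"))
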

Original language: English
Title of host publication: Proceedings of the SICE Annual Conference
Pages: 2601-2605
Number of pages: 5
DOIs: https://doi.org/10.1109/SICE.2007.4421430
Publication status: Published - 2007
Event: SICE (Society of Instrument and Control Engineers) Annual Conference, SICE 2007 - Takamatsu
Duration: 2007 Sep 17 - 2007 Sep 20

Other

Other: SICE (Society of Instrument and Control Engineers) Annual Conference, SICE 2007
City: Takamatsu
Period: 07/9/17 - 07/9/20

Fingerprint

  • Recurrent neural networks
  • Reinforcement learning
  • Experiments

Keywords

  • Selected keywords relevant to the subject

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Dung, L. T., Komeda, T., & Takagi, M. (2007). Reinforcement learning in non-markovian environments using automatic discovery of subgoals. In Proceedings of the SICE Annual Conference (pp. 2601-2605). [4421430] https://doi.org/10.1109/SICE.2007.4421430

@inproceedings{bf051816effb4203ae6c9457ff93b642,
title = "Reinforcement learning in non-markovian environments using automatic discovery of subgoals",
abstract = "Learning time is always a critical issue in Reinforcement Learning, especially when Recurrent Neural Networks (RNNs) are used to predict Q values. By creating useful subgoals, we can speed up learning performance. In this paper, we propose a method to accelerate learning in non-Markovian environments using automatic discovery of subgoals. Once subgoals are created, sub-policies use RNNs to attain them. Then learned RNNs are integrated into the main RNN as experts. Finally, the agent continues to learn using its new policy. Experiment results of the E maze problem and the virtual office problem show the potential of this approach.",
keywords = "Selected keywords relevant to the subject",
author = "Dung, {Le Tien} and Takashi Komeda and Motoki Takagi",
year = "2007",
doi = "10.1109/SICE.2007.4421430",
language = "English",
isbn = "4907764286",
pages = "2601--2605",
booktitle = "Proceedings of the SICE Annual Conference",

}

TY - GEN

T1 - Reinforcement learning in non-markovian environments using automatic discovery of subgoals

AU - Dung, Le Tien

AU - Komeda, Takashi

AU - Takagi, Motoki

PY - 2007

Y1 - 2007

N2 - Learning time is always a critical issue in Reinforcement Learning, especially when Recurrent Neural Networks (RNNs) are used to predict Q values. By creating useful subgoals, we can speed up learning performance. In this paper, we propose a method to accelerate learning in non-Markovian environments using automatic discovery of subgoals. Once subgoals are created, sub-policies use RNNs to attain them. Then learned RNNs are integrated into the main RNN as experts. Finally, the agent continues to learn using its new policy. Experiment results of the E maze problem and the virtual office problem show the potential of this approach.

AB - Learning time is always a critical issue in Reinforcement Learning, especially when Recurrent Neural Networks (RNNs) are used to predict Q values. By creating useful subgoals, we can speed up learning performance. In this paper, we propose a method to accelerate learning in non-Markovian environments using automatic discovery of subgoals. Once subgoals are created, sub-policies use RNNs to attain them. Then learned RNNs are integrated into the main RNN as experts. Finally, the agent continues to learn using its new policy. Experiment results of the E maze problem and the virtual office problem show the potential of this approach.

KW - Selected keywords relevant to the subject

UR - http://www.scopus.com/inward/record.url?scp=50249160640&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=50249160640&partnerID=8YFLogxK

U2 - 10.1109/SICE.2007.4421430

DO - 10.1109/SICE.2007.4421430

M3 - Conference contribution

SN - 4907764286

SN - 9784907764289

SP - 2601

EP - 2605

BT - Proceedings of the SICE Annual Conference

ER -