Efficient experience reuse in non-Markovian environments

Le Tien Dung, Takashi Komeda, Motoki Takagi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

Learning time is always a critical issue in Reinforcement Learning, especially when Recurrent Neural Networks are used to predict Q values in non-Markovian environments. Experience reuse has received much attention due to its ability to reduce learning time. In this paper, we propose a new method to reuse experience efficiently. Our method generates new episodes from recorded episodes using an action-pair merger. Recorded episodes and new episodes are replayed after each learning epoch. We compare our method with standard online learning and with learning using experience replay on a vision-based robot problem. The results show the potential of this approach.
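To make the idea in the abstract concrete, below is a minimal Python sketch of episode-level experience reuse for a recurrent Q-learner. It is not the authors' implementation: the EpisodeReplayBuffer class, the splice-based generate_merged step standing in for the paper's action-pair merger, and the q_update callback are all illustrative assumptions.

```python
# Minimal sketch of episode-level experience reuse (illustrative only;
# the paper's action-pair merger is approximated by a simple splice).
import random
from typing import Callable, List, Tuple

Transition = Tuple[object, int, float]   # (observation, action, reward)
Episode = List[Transition]


class EpisodeReplayBuffer:
    """Stores complete episodes so a recurrent Q-learner can replay whole
    trajectories and rebuild its hidden state from the episode start."""

    def __init__(self, capacity: int = 100):
        self.capacity = capacity
        self.episodes: List[Episode] = []

    def record(self, episode: Episode) -> None:
        """Keep a finished episode, discarding the oldest once full."""
        self.episodes.append(episode)
        if len(self.episodes) > self.capacity:
            self.episodes.pop(0)

    def generate_merged(self, n_new: int) -> List[Episode]:
        """Hypothetical stand-in for the paper's action-pair merger:
        splice a prefix of one recorded episode onto a suffix of another
        to obtain a new, never-executed trajectory."""
        if len(self.episodes) < 2:
            return []
        merged: List[Episode] = []
        for _ in range(n_new):
            a, b = random.sample(self.episodes, 2)
            cut_a = random.randrange(1, max(2, len(a)))
            cut_b = random.randrange(1, max(2, len(b)))
            merged.append(a[:cut_a] + b[cut_b:])
        return merged

    def replay_all(self, n_new: int,
                   q_update: Callable[[Episode], None]) -> None:
        """After each learning epoch, replay every recorded episode plus
        freshly generated ones through the Q-value update routine."""
        for episode in self.episodes + self.generate_merged(n_new):
            q_update(episode)
```

In use, the agent would call record() at the end of every episode and replay_all() after each epoch, with q_update performing the recurrent network's pass over the full trajectory. Replaying whole episodes rather than isolated transitions is what lets the recurrent network reconstruct the context it needs in a non-Markovian environment.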

Original language: English
Title of host publication: Proceedings of the SICE Annual Conference
Pages: 3327-3332
Number of pages: 6
DOIs: https://doi.org/10.1109/SICE.2008.4655239
Publication status: Published - 2008
Event: SICE Annual Conference 2008 - International Conference on Instrumentation, Control and Information Technology - Tokyo
Duration: 2008 Aug 20 - 2008 Aug 22

Other

Other: SICE Annual Conference 2008 - International Conference on Instrumentation, Control and Information Technology
City: Tokyo
Period: 08/8/20 - 08/8/22

Fingerprint

  • Recurrent neural networks
  • Reinforcement learning
  • Robots

Keywords

  • Recurrent neural networks
  • Reinforcement learning

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Control and Systems Engineering
  • Computer Science Applications

Cite this

Dung, L. T., Komeda, T., & Takagi, M. (2008). Efficient experience reuse in non-Markovian environments. In Proceedings of the SICE Annual Conference (pp. 3327-3332). [4655239] https://doi.org/10.1109/SICE.2008.4655239

Efficient experience reuse in non-Markovian environments. / Dung, Le Tien; Komeda, Takashi; Takagi, Motoki.

Proceedings of the SICE Annual Conference. 2008. p. 3327-3332 4655239.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Dung, LT, Komeda, T & Takagi, M 2008, Efficient experience reuse in non-Markovian environments. in Proceedings of the SICE Annual Conference., 4655239, pp. 3327-3332, SICE Annual Conference 2008 - International Conference on Instrumentation, Control and Information Technology, Tokyo, 08/8/20. https://doi.org/10.1109/SICE.2008.4655239
Dung LT, Komeda T, Takagi M. Efficient experience reuse in non-Markovian environments. In Proceedings of the SICE Annual Conference. 2008. p. 3327-3332. 4655239 https://doi.org/10.1109/SICE.2008.4655239
Dung, Le Tien ; Komeda, Takashi ; Takagi, Motoki. / Efficient experience reuse in non-Markovian environments. Proceedings of the SICE Annual Conference. 2008. pp. 3327-3332
@inproceedings{c79adfc5754843abbcd208922d975cbc,
title = "Efficient experience reuse in non-Markovian environments",
abstract = "Learning time is always a critical issue in Reinforcement Learning, especially when Recurrent Neural Networks are used to predict Q values in non-Markovian environments. Experience reuse has received much attention due to its ability to reduce learning time. In this paper, we propose a new method to reuse experience efficiently. Our method generates new episodes from recorded episodes using an action-pair merger. Recorded episodes and new episodes are replayed after each learning epoch. We compare our method with standard online learning and with learning using experience replay on a vision-based robot problem. The results show the potential of this approach.",
keywords = "Recurrent neural networks, Reinforcement learning",
author = "Dung, {Le Tien} and Takashi Komeda and Motoki Takagi",
year = "2008",
doi = "10.1109/SICE.2008.4655239",
language = "English",
isbn = "9784907764296",
pages = "3327--3332",
booktitle = "Proceedings of the SICE Annual Conference",

}

TY - GEN

T1 - Efficient experience reuse in non-Markovian environments

AU - Dung, Le Tien

AU - Komeda, Takashi

AU - Takagi, Motoki

PY - 2008

Y1 - 2008

N2 - Learning time is always a critical issue in Reinforcement Learning, especially when Recurrent Neural Networks are used to predict Q values in non-Markovian environments. Experience reuse has received much attention due to its ability to reduce learning time. In this paper, we propose a new method to reuse experience efficiently. Our method generates new episodes from recorded episodes using an action-pair merger. Recorded episodes and new episodes are replayed after each learning epoch. We compare our method with standard online learning and with learning using experience replay on a vision-based robot problem. The results show the potential of this approach.

AB - Learning time is always a critical issue in Reinforcement Learning, especially when Recurrent Neural Networks are used to predict Q values in non-Markovian environments. Experience reuse has received much attention due to its ability to reduce learning time. In this paper, we propose a new method to reuse experience efficiently. Our method generates new episodes from recorded episodes using an action-pair merger. Recorded episodes and new episodes are replayed after each learning epoch. We compare our method with standard online learning and with learning using experience replay on a vision-based robot problem. The results show the potential of this approach.

KW - Recurrent neural networks

KW - Reinforcement learning

UR - http://www.scopus.com/inward/record.url?scp=56749173285&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=56749173285&partnerID=8YFLogxK

U2 - 10.1109/SICE.2008.4655239

DO - 10.1109/SICE.2008.4655239

M3 - Conference contribution

SN - 9784907764296

SP - 3327

EP - 3332

BT - Proceedings of the SICE Annual Conference

ER -