Speeding up reinforcement learning using recurrent neural networks in non-Markovian environments

Tien Dung Le, Takashi Komeda, Motoki Takagi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Reinforcement Learning (RL) has been widely used to solve problems with little feedback from the environment. Q-learning can solve Markov Decision Processes quite well. For Partially Observable Markov Decision Processes, a Recurrent Neural Network (RNN) can be used to approximate Q values. However, learning time for these problems is typically very long. In this paper, we present a method to speed up learning in non-Markovian environments by focusing on the necessary state-action pairs in learning episodes. Whenever the agent attains the goal, it checks the episode and relearns the necessary actions. We use a table storing the minimum number of appearances of each state across all successful episodes to remove unnecessary state-action pairs from a successful episode and to form a min-episode. To verify this method, we performed two experiments: the E-maze problem with a Time-Delay Neural Network and the lighting grid-world problem with a Long Short-Term Memory RNN. Experimental results show that the proposed method enables an agent to acquire a policy with better learning performance than the standard method.
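The abstract describes the min-episode idea only in prose. The sketch below is a hypothetical Python illustration of one plausible reading: a table records, per state, the fewest visits observed in any successful episode so far, and a new successful episode is pruned to that budget before the pruned state-action pairs are relearned. The class name `MinEpisodeFilter` and its methods are illustrative assumptions, not code from the paper.

```python
from collections import Counter

# Illustrative sketch (assumption, not the paper's implementation):
# keep, per state, only as many visits as the fewest seen in any
# successful episode so far, and drop the surplus state-action pairs.

class MinEpisodeFilter:
    def __init__(self):
        # min_visits[state] = minimum number of times `state` appeared
        # in any successful episode observed so far
        self.min_visits = {}

    def update(self, episode):
        """episode: list of (state, action) pairs from a successful run."""
        counts = Counter(state for state, _ in episode)
        for state, n in counts.items():
            prev = self.min_visits.get(state, n)
            self.min_visits[state] = min(prev, n)

    def min_episode(self, episode):
        """Drop repeated visits of a state beyond its recorded minimum."""
        seen = Counter()
        kept = []
        for state, action in episode:
            seen[state] += 1
            if seen[state] <= self.min_visits.get(state, seen[state]):
                kept.append((state, action))
        return kept

# Usage sketch: after each successful episode, update the table and
# relearn only the pruned (min) episode with the RNN Q-approximator.
# filt = MinEpisodeFilter()
# filt.update(successful_episode)
# pruned = filt.min_episode(successful_episode)
```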

Original language: English
Title of host publication: Proceedings of the 11th IASTED International Conference on Artificial Intelligence and Soft Computing, ASC 2007
Pages: 179-184
Number of pages: 6
Publication status: Published - 2007
Event: 11th IASTED International Conference on Artificial Intelligence and Soft Computing, ASC 2007 - Palma de Mallorca, Spain
Duration: 2007 Aug 29 - 2007 Aug 31

Publication series

Name: Proceedings of the 11th IASTED International Conference on Artificial Intelligence and Soft Computing, ASC 2007

Conference

Conference: 11th IASTED International Conference on Artificial Intelligence and Soft Computing, ASC 2007
Country/Territory: Spain
City: Palma de Mallorca
Period: 07/8/29 - 07/8/31

Keywords

  • Neural networks
  • Machine learning

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Software
