Mixed reinforcement learning for partially observable Markov decision process

Le Tien Dung, Takashi Komeda, Motoki Takagi

Research output: Conference contribution

5 Citations (Scopus)

Abstract

Reinforcement Learning has been widely used to solve problems with little feedback from the environment. Q learning can solve fully observable Markov Decision Processes quite well. For Partially Observable Markov Decision Processes (POMDPs), a Recurrent Neural Network (RNN) can be used to approximate Q values. However, learning time for these problems is typically very long. In this paper, Mixed Reinforcement Learning is presented to find an optimal policy for POMDPs in a shorter learning time. This method uses both a Q value table and an RNN. The Q value table stores Q values for fully observable states, and the RNN approximates Q values for hidden states. An observable degree is calculated for each state while the agent explores the environment. If the observable degree is less than a threshold, the state is considered a hidden state. Results of experiments in the lighting grid world problem show that the proposed method enables an agent to acquire a policy as good as the policy acquired using only an RNN, with better learning performance.
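The abstract describes the mechanism concretely enough to sketch: a tabular Q function serves states judged observable, a recurrent network serves states judged hidden, and a per-state observable degree compared against a threshold routes between the two. The Python sketch below illustrates one way such routing could be wired up. It is illustrative only: the Elman-style network, the observable-degree proxy (transition determinism), and all hyperparameters (threshold, alpha, gamma, epsilon) are assumptions, not the paper's actual definitions.

```python
import numpy as np
from collections import defaultdict, Counter


class ElmanQNet:
    """Minimal Elman-style RNN Q approximator (illustrative stand-in;
    the abstract does not specify the network architecture)."""

    def __init__(self, n_obs, n_actions, n_hidden=16, lr=0.05, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = 0.1 * rng.standard_normal((n_hidden, n_obs + n_hidden))
        self.W_out = 0.1 * rng.standard_normal((n_actions, n_hidden))
        self.h = np.zeros(n_hidden)
        self.lr = lr

    def reset(self):
        """Clear the recurrent context at the start of an episode."""
        self.h[:] = 0.0

    def forward(self, obs_vec):
        """Advance the hidden state and return Q estimates for all actions."""
        self.h = np.tanh(self.W_in @ np.concatenate([obs_vec, self.h]))
        return self.W_out @ self.h

    def update(self, action, td_error):
        """One gradient step on the output weights for the taken action."""
        self.W_out[action] += self.lr * td_error * self.h


class MixedQAgent:
    """Mixed RL sketch: a Q table for states judged fully observable,
    an RNN for states judged hidden, switched by an observable degree."""

    def __init__(self, n_obs, n_actions, threshold=0.9,
                 alpha=0.1, gamma=0.95, epsilon=0.1):
        self.n_obs, self.n_actions = n_obs, n_actions
        self.q_table = defaultdict(lambda: np.zeros(n_actions))
        self.rnn = ElmanQNet(n_obs, n_actions)
        self.threshold = threshold
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        # Transition statistics feeding the observable-degree proxy.
        self.next_counts = defaultdict(Counter)  # (obs, action) -> next obs

    def one_hot(self, obs):
        v = np.zeros(self.n_obs)
        v[obs] = 1.0
        return v

    def observable_degree(self, obs):
        """Hypothetical proxy: how deterministic the observed transitions
        out of `obs` look so far. The paper computes an observable degree
        during exploration; its exact formula is not reproduced here."""
        fracs = [max(c.values()) / sum(c.values())
                 for a in range(self.n_actions)
                 if (c := self.next_counts[(obs, a)])]
        return min(fracs) if fracs else 1.0  # optimistic until evidence

    def q_values(self, obs):
        """Route the Q lookup: table if observable enough, RNN otherwise."""
        if self.observable_degree(obs) >= self.threshold:
            return self.q_table[obs], "table"
        return self.rnn.forward(self.one_hot(obs)), "rnn"

    def act(self, obs):
        """Epsilon-greedy action; also returns the Q estimate and its
        source so learn() can update the right estimator."""
        q, source = self.q_values(obs)
        if np.random.random() < self.epsilon:
            return np.random.randint(self.n_actions), q, source
        return int(np.argmax(q)), q, source

    def learn(self, obs, action, q, source, reward, next_obs, done):
        self.next_counts[(obs, action)][next_obs] += 1
        h_saved = self.rnn.h.copy()          # bootstrapping must not
        next_q, _ = self.q_values(next_obs)  # advance the recurrent
        self.rnn.h = h_saved                 # context twice per step
        target = reward + (0.0 if done else self.gamma * float(np.max(next_q)))
        td_error = target - q[action]
        if source == "table":
            self.q_table[obs][action] += self.alpha * td_error
        else:
            self.rnn.update(action, td_error)
```

In an episodic task like the lighting grid world mentioned above, a driver loop would call act(), step the environment, pass the cached q and source into learn(), and call agent.rnn.reset() between episodes. The routing is the point of the method: updates for observable states land in the table immediately, so only the genuinely aliased states pay the cost of recurrent training, which is consistent with the shorter learning time the abstract reports.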

Original language: English
Host publication title: Proceedings of the 2007 IEEE International Symposium on Computational Intelligence in Robotics and Automation, CIRA 2007
Pages: 7-12
Number of pages: 6
DOI: 10.1109/CIRA.2007.382910
Publication status: Published - 9 Oct 2007
Event: 2007 IEEE International Symposium on Computational Intelligence in Robotics and Automation, CIRA 2007 - Jacksonville, FL, United States
Duration: 20 Jun 2007 → 23 Jun 2007

Publication series

Name: Proceedings of the 2007 IEEE International Symposium on Computational Intelligence in Robotics and Automation, CIRA 2007

Conference

Conference: 2007 IEEE International Symposium on Computational Intelligence in Robotics and Automation, CIRA 2007
Country: United States
City: Jacksonville, FL
Period: 07/6/20 → 07/6/23

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Electrical and Electronic Engineering


Cite this

Dung, L. T., Komeda, T., & Takagi, M. (2007). Mixed reinforcement learning for partially observable Markov decision process. In Proceedings of the 2007 IEEE International Symposium on Computational Intelligence in Robotics and Automation, CIRA 2007 (pp. 7-12). [4269910] (Proceedings of the 2007 IEEE International Symposium on Computational Intelligence in Robotics and Automation, CIRA 2007). https://doi.org/10.1109/CIRA.2007.382910