### Abstract

Reinforcement learning has been widely used to solve problems with little feedback from the environment. Q-learning can solve fully observable Markov Decision Processes quite well. For Partially Observable Markov Decision Processes (POMDPs), a Recurrent Neural Network (RNN) can be used to approximate Q values. However, learning time for these problems is typically very long. In this paper, Mixed Reinforcement Learning is presented to find an optimal policy for POMDPs in a shorter learning time. This method uses both a Q value table and an RNN. The Q value table stores Q values for fully observable states, and the RNN approximates Q values for hidden states. An observable degree is calculated for each state while the agent explores the environment. If the observable degree is less than a threshold, the state is considered a hidden state. Results of an experiment on the lighting grid world problem show that the proposed method enables an agent to acquire a policy as good as the one acquired using only an RNN, with better learning performance.
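The routing idea in the abstract (table for fully observable states, approximator for hidden ones) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define the observable degree, so the consistency measure used here is an assumption, as are the class names `ObservabilityTracker` and `MixedQ`, the threshold of 0.9, and the constant-valued callable that stands in for the RNN.

```python
from collections import Counter, defaultdict


class ObservabilityTracker:
    """Tracks a hypothetical 'observable degree' per observation.

    ASSUMPTION: the abstract does not define the observable degree. Here it is
    the fraction of visits whose outcome matched the most frequent outcome for
    the same (observation, action) pair; 1.0 means transitions look consistent.
    """

    def __init__(self, threshold=0.9):
        self.threshold = threshold  # arbitrary illustrative value
        self.outcomes = defaultdict(Counter)  # (obs, action) -> next_obs counts

    def update(self, obs, action, next_obs):
        self.outcomes[(obs, action)][next_obs] += 1

    def degree(self, obs):
        total = 0
        consistent = 0
        for (seen_obs, _action), counts in self.outcomes.items():
            if seen_obs == obs:
                total += sum(counts.values())
                consistent += max(counts.values())
        # Unvisited observations are treated as observable until shown otherwise.
        return consistent / total if total else 1.0

    def is_hidden(self, obs):
        return self.degree(obs) < self.threshold


class MixedQ:
    """Routes Q-value lookups to a table or to an approximator."""

    def __init__(self, actions, tracker, approximator):
        self.actions = list(actions)
        self.tracker = tracker
        # Stand-in for the RNN: any callable (history, action) -> float.
        self.approximator = approximator
        self.table = defaultdict(float)

    def q_value(self, obs, action, history):
        if self.tracker.is_hidden(obs):
            return self.approximator(history, action)
        return self.table[(obs, action)]

    def greedy_action(self, obs, history):
        return max(self.actions, key=lambda a: self.q_value(obs, a, history))

    def update_table(self, obs, action, reward, next_obs, next_history,
                     alpha=0.1, gamma=0.9):
        """Standard tabular Q-learning update, skipped for hidden states."""
        if self.tracker.is_hidden(obs):
            return  # hidden states are left to the approximator
        best_next = max(self.q_value(next_obs, a, next_history)
                        for a in self.actions)
        key = (obs, action)
        self.table[key] += alpha * (reward + gamma * best_next - self.table[key])


tracker = ObservabilityTracker(threshold=0.9)
agent = MixedQ(["left", "right"], tracker, approximator=lambda history, action: 0.0)

# "corridor" behaves inconsistently (aliased); "start" behaves consistently.
tracker.update("corridor", "right", "goal")
tracker.update("corridor", "right", "wall")
tracker.update("start", "right", "corridor")
tracker.update("start", "right", "corridor")

agent.update_table("start", "right", 1.0, "corridor", next_history=[])
```

In this toy run, `"corridor"` falls below the threshold and is routed to the approximator, while `"start"` stays in the table. Training the recurrent approximator itself is outside the scope of the sketch.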

Original language | English |
---|---|
Title of host publication | Proceedings of the 2007 IEEE International Symposium on Computational Intelligence in Robotics and Automation, CIRA 2007 |
Pages | 7-12 |
Number of pages | 6 |
DOI | https://doi.org/10.1109/CIRA.2007.382910 |
Publication status | Published - 2007-10-09 |
Event | 2007 IEEE International Symposium on Computational Intelligence in Robotics and Automation, CIRA 2007 - Jacksonville, FL, United States, Duration: 2007-06-20 → 2007-06-23 |

### Publication series

Name | Proceedings of the 2007 IEEE International Symposium on Computational Intelligence in Robotics and Automation, CIRA 2007 |
---|---|

### Conference

Conference | 2007 IEEE International Symposium on Computational Intelligence in Robotics and Automation, CIRA 2007 |
---|---|
Country | United States |
City | Jacksonville, FL |
Period | 2007-06-20 → 2007-06-23 |

### ASJC Scopus subject areas

- Artificial Intelligence
- Software
- Control and Systems Engineering
- Electrical and Electronic Engineering

## Fingerprint

Dive into the research topics of 'Mixed reinforcement learning for partially observable Markov decision process'. Together they form a unique fingerprint.

## Cite this

*Proceedings of the 2007 IEEE International Symposium on Computational Intelligence in Robotics and Automation, CIRA 2007* (pp. 7-12). [4269910] (Proceedings of the 2007 IEEE International Symposium on Computational Intelligence in Robotics and Automation, CIRA 2007). https://doi.org/10.1109/CIRA.2007.382910