Behavior Learning Based on a Policy Gradient Method: Separation of Environmental Dynamics and State Values in Policies

Research output: Contribution to journal › Article

1 Citation (Scopus)
Original language: English
Pages (from-to): 164-174
Journal: PRICAI2008, Proceedings Lecture Notes in Computer Science
Volume: 5351
Publication status: Published - 2008 Dec 19
