Behavior Learning Based on a Policy Gradient Method: Separation of Environmental Dynamics and State Values in Policies

Seiji Ishihara, Harukazu Igarashi

Research output: Contribution to journal, Article, peer-reviewed

1 Citation (Scopus)
Original language: English
Pages (from-to): 164-174
Journal: PRICAI2008, Proceedings Lecture Notes in Computer Science
Volume: 5351
Publication status: Published - 2008 Dec 19
