Learning of soccer player agents using a policy gradient method: Coordination between kicker and receiver during free kicks

H. Igarashi, K. Nakamura, S. Ishihara

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

5 Citations (Scopus)

Abstract

The RoboCup Simulation League is recognized as a test bed for research on multi-agent learning. As an example of multi-agent learning in a soccer game, we address a learning problem between a kicker and a receiver when a direct free kick is awarded just outside the opponent's penalty area. In such a situation, to which point should the kicker kick the ball? We propose a function that expresses heuristics for evaluating how advantageous a target point is for safely sending and receiving a pass and for scoring. The heuristics include an interaction term between the kicker and the receiver to strengthen their coordination. To calculate the interaction term, each agent is given a model of its teammate's action decision (the kicker holds a receiver model, and the receiver holds a kicker model) so that it can predict its teammate's action. The evaluation function makes it possible to handle a large state space consisting of the positions of the kicker, the receiver, and their opponents. The kicker selects the target point of the free kick by Boltzmann selection using the evaluation function. The parameters of the function can be learned by a kind of reinforcement learning called the policy gradient method. The point to which the receiver should run to receive the ball is learned simultaneously in the same manner. The effectiveness of our solution was shown by experiments.
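
The selection and learning steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the form of the evaluation function, so the sketch assumes a weighted sum of heuristic features in which a higher score means a better target point, and it uses a plain REINFORCE-style update as one common instance of a policy gradient method. All names (evaluate, select_target, policy_gradient_step), the feature values, the temperature, and the learning rate are hypothetical.

```python
import math
import random


def evaluate(weights, features):
    """Assumed evaluation function: weighted sum of heuristic feature values."""
    return sum(w * f for w, f in zip(weights, features))


def boltzmann_probs(scores, temperature=1.0):
    """Boltzmann (softmax) distribution over candidate scores."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_target(weights, candidate_features, temperature=1.0, rng=random):
    """Sample the index of a candidate target point by Boltzmann selection."""
    scores = [evaluate(weights, f) for f in candidate_features]
    probs = boltzmann_probs(scores, temperature)
    index = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return index, probs


def policy_gradient_step(weights, candidate_features, chosen, probs,
                         reward, lr=0.1, temperature=1.0):
    """One REINFORCE-style update of the evaluation-function weights.

    For a Boltzmann policy over a linear score, the gradient of
    log pi(chosen) with respect to weight j is
    (feature_j(chosen) - expected feature_j under pi) / temperature.
    """
    updated = []
    for j, w in enumerate(weights):
        expected = sum(p * f[j] for p, f in zip(probs, candidate_features))
        grad = (candidate_features[chosen][j] - expected) / temperature
        updated.append(w + lr * reward * grad)
    return updated


if __name__ == "__main__":
    # Three hypothetical candidate target points, two hypothetical features each.
    candidates = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    weights = [0.0, 0.0]
    rng = random.Random(0)
    chosen, probs = select_target(weights, candidates, rng=rng)
    # A reward of 1.0 stands in for a successful free kick in this episode.
    weights = policy_gradient_step(weights, candidates, chosen, probs, reward=1.0)
    print(chosen, weights)
```

In this sketch a positive reward raises the probability of the chosen target point and a negative reward lowers it. A receiver agent could apply the same two functions to its own candidate run-to points; the interaction term mentioned in the abstract would enter as one more feature computed from the agent's model of its teammate, which is not modeled here.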

Original language: English
Title of host publication: 2008 International Joint Conference on Neural Networks, IJCNN 2008
Pages: 46-52
Number of pages: 7
Publication status: Published - 2008
Event: 2008 International Joint Conference on Neural Networks, IJCNN 2008 - Hong Kong, China
Duration: 2008 Jun 1 to 2008 Jun 8

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks

Conference

Conference: 2008 International Joint Conference on Neural Networks, IJCNN 2008
Country/Territory: China
City: Hong Kong
Period: 08/6/1 to 08/6/8

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
