### Abstract

In a previous paper, we proposed a solution to the path-planning problem of a mobile robot, formulating it as a discrete optimization problem at each time step. To solve the optimization problem, we used an objective function consisting of a goal term, a smoothness term, and a collision term. This paper presents a theoretical method that uses reinforcement learning to adjust the weight parameters in the objective function. Because the conventional Q-learning method cannot be applied to a non-Markov decision process, we applied Williams's learning algorithm, REINFORCE, to derive an updating rule for the weight parameters. REINFORCE is a stochastic hill-climbing method that maximizes a value function. We verified the updating rule experimentally.
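The abstract's idea — a weighted objective whose weights are tuned by a REINFORCE-style stochastic gradient — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the per-term costs, the Boltzmann (Gibbs) form of the action policy, the temperature, the learning rate, and the baseline are all assumptions introduced here for the sake of a runnable example.

```python
import math
import random

# Hypothetical per-term costs for each candidate action (illustrative numbers
# only): columns are the [goal, smoothness, collision] terms of the objective.
term_costs = {
    "forward": [1.0, 0.2, 0.0],
    "left":    [1.5, 0.8, 0.0],
    "right":   [1.4, 0.8, 2.0],
}

weights = [1.0, 1.0, 1.0]   # w_goal, w_smooth, w_coll (to be learned)
T = 1.0                     # temperature of the assumed Boltzmann policy
alpha = 0.05                # learning rate
baseline = 0.0              # reinforcement baseline b in REINFORCE

def energy(action):
    """Weighted objective E(a) = sum_i w_i * E_i(a)."""
    return sum(w * e for w, e in zip(weights, term_costs[action]))

def policy():
    """Boltzmann distribution over actions: pi(a) proportional to exp(-E(a)/T)."""
    unnorm = {a: math.exp(-energy(a) / T) for a in term_costs}
    z = sum(unnorm.values())
    return {a: v / z for a, v in unnorm.items()}

def reinforce_update(action, reward):
    """REINFORCE rule: dw_i = alpha * (r - b) * d(log pi(a))/dw_i.

    For the Boltzmann policy above,
    d(log pi(a))/dw_i = -(E_i(a) - E_pi[E_i]) / T,
    where E_pi[E_i] is the policy-averaged i-th term cost.
    """
    pi = policy()
    for i in range(len(weights)):
        avg = sum(pi[a2] * term_costs[a2][i] for a2 in term_costs)
        grad = -(term_costs[action][i] - avg) / T
        weights[i] += alpha * (reward - baseline) * grad

# One illustrative step: sample an action from the policy, receive a
# (hypothetical) reward, and update the weight parameters.
pi = policy()
action = random.choices(list(pi), weights=list(pi.values()))[0]
reinforce_update(action, reward=1.0)
```

Sampling actions stochastically rather than greedily minimizing the objective is what makes the policy-gradient estimate well defined: the update climbs the expected reward without requiring the Markov property that Q-learning assumes.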

| Original language | English |
|---|---|
| Title of host publication | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
| Pages | 315-320 |
| Number of pages | 6 |
| Volume | 2019 LNAI |
| Publication status | Published - 2001 |
| Externally published | Yes |
| Event | 4th Robot World Cup Soccer Games and Conferences, RoboCup 2000 - Melbourne, VIC; Duration: 2000 Aug 27 → 2000 Sep 3 |

### Publication series

| Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
|---|---|
| Volume | 2019 LNAI |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |

### Other

| Other | 4th Robot World Cup Soccer Games and Conferences, RoboCup 2000 |
|---|---|
| City | Melbourne, VIC |
| Period | 00/8/27 → 00/9/3 |


### ASJC Scopus subject areas

- Computer Science(all)
- Theoretical Computer Science

### Cite this

Igarashi, Harukazu. **Path planning of a mobile robot as a discrete optimization problem and adjustment of weight parameters in the objective function by reinforcement learning.** In *Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)*, Vol. 2019 LNAI, pp. 315-320. 4th Robot World Cup Soccer Games and Conferences, RoboCup 2000, Melbourne, VIC, 2000 Aug 27 → Sep 3.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

```
TY - GEN
T1 - Path planning of a mobile robot as a discrete optimization problem and adjustment of weight parameters in the objective function by reinforcement learning
AU - Igarashi, Harukazu
PY - 2001
Y1 - 2001
N2 - In a previous paper, we proposed a solution to path planning of a mobile robot. In our approach, we formulated the problem as a discrete optimization problem at each time step. To solve the optimization problem, we used an objective function consisting of a goal term, a smoothness term and a collision term. This paper presents a theoretical method using reinforcement learning for adjusting weight parameters in the objective functions. However, the conventional Q-learning method cannot be applied to a non-Markov decision process. Thus, we applied Williams's learning algorithm, REINFORCE, to derive an updating rule for the weight parameters. This is a stochastic hill-climbing method to maximize a value function. We verified the updating rule by experiment.
UR - http://www.scopus.com/inward/record.url?scp=84867460364&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84867460364&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84867460364
SN - 3540421858
SN - 9783540421856
VL - 2019 LNAI
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 315
EP - 320
BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ER -
```