Motion planning of a mobile robot using reinforcement learning

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

In a previous paper, we proposed a solution to the navigation problem of a mobile robot. In our approach, we formulated the following two problems at each time step as discrete optimization problems: 1) estimation of the position and direction of the robot, and 2) action decision. While our simulation results showed the effectiveness of this approach, the values of the weights in the objective functions were set by a heuristic method. This paper presents a theoretical method, based on reinforcement learning, for adjusting the weight parameters of the objective function that encodes pieces of heuristic knowledge about the action decision. In our reinforcement learning, the expected reward given to a robot's trajectory is defined as the value function to be maximized. Because a probabilistic policy is used to determine the robot's actions, in order to search for the globally optimal trajectory, the robot's trajectories are generated stochastically. However, this decision process is not a Markov decision process, because the objective function includes the action taken at the previous time step. Thus Q-learning, a conventional reinforcement-learning method, cannot be applied to this problem. Instead, we applied Williams's episodic REINFORCE approach to the action decision and derived a learning rule for the weight parameters of the objective function. Moreover, we applied the stochastic hill-climbing method to the maximization of the value function to reduce computation time. The learning rule was verified by our experiments.
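
To make the method described above concrete, the Python sketch below illustrates one way the pieces could fit together. It is a minimal illustration under assumptions of ours, not the paper's implementation: we assume a Boltzmann (softmax) policy over an objective E(a) = sum_k w_k * U_k(a) that is a weighted sum of heuristic terms, and Williams's episodic REINFORCE gradient accumulated over a whole trajectory. The function names, the feature layout, and the omission of a reward baseline are all illustrative.

import numpy as np

rng = np.random.default_rng(0)

def boltzmann_policy(weights, features, temperature=1.0):
    # features: (n_actions, n_terms) matrix of heuristic objective terms U_k(a).
    # The objective is E(a) = sum_k w_k * U_k(a); lower E(a) -> higher probability.
    logits = -(features @ weights) / temperature
    logits -= logits.max()  # subtract max for numerical stability
    p = np.exp(logits)
    return p / p.sum()

def episodic_reinforce_update(weights, episode, reward, lr=0.01, temperature=1.0):
    # Williams's episodic REINFORCE: accumulate the eligibility
    # d/dw log pi(a_t) over the whole episode, then scale by the episode's
    # reward (a baseline term is omitted here for brevity).
    grad = np.zeros_like(weights)
    for features, action in episode:
        p = boltzmann_policy(weights, features, temperature)
        # For this softmax policy, d log pi(a)/dw_k = (E_p[U_k] - U_k(a)) / T.
        grad += (p @ features - features[action]) / temperature
    return weights + lr * reward * grad

# Toy usage: 4 candidate actions scored by 3 heuristic terms.
w = np.zeros(3)
feats = rng.random((4, 3))
a = rng.choice(4, p=boltzmann_policy(w, feats))
w = episodic_reinforce_update(w, [(feats, a)], reward=1.0)

Because the objective terms may depend on the previous action, the per-step decision is not Markovian in the state alone; the episodic form, which scales the summed log-probability gradient of the whole trajectory by the trajectory's reward, remains a valid likelihood-ratio gradient in that setting, which is the abstract's reason for preferring it over Q-learning.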

Original language: English
Pages (from-to): 501-509
Number of pages: 9
Journal: Transactions of the Japanese Society for Artificial Intelligence
Volume: 16
Issue number: 6
DOIs: 10.1527/tjsai.16.501
Publication status: Published - 2001
Externally published: Yes

Keywords

  • Discrete optimization problem
  • Mobile robot
  • Motion planning
  • Navigation
  • Reinforcement learning

ASJC Scopus subject areas

  • Artificial Intelligence

Cite this

@article{c9638f8ad79d47479cfc22cb15507c05,
title = "Motion planning of a mobile robot using reinforcement learning",
abstract = "In a previous paper, we proposed a solution to navigation of a mobile robot. In our approach, we formulated the following two problems at each time step as discrete optimization problems: 1) estimation of position and direction of a robot, and 2)action decision. While the results of our simulation showed the effectiveness of our approach, the values of weights in the objective functions were given by a heuristic method. This paper presents a theoretical method using reinforcement learning for adjusting the weight parameters in the objective function that includes pieces of heuristic knowledge on the action decision. In our reinforcement learning, the expectation of a reward given to a robot's trajectory is defined as the value function to maximize. The robot's trajectories are generated stochastically because we used a probabilistic policy for determining actions of a robot to search for the global optimal trajectory. However, this decision process is not a Markov decision process because the objective function includes an action at the previous time. Thus, Q-learning, which is a conventional method of reinforcement learning, cannot be applied to this problem. In this paper, we applied Williams's episodic REINFORCE approach to the action decision and derived a learning rule for the weight parameters of the objective function. Moreover, we applied the stochastic hill-climbing method to maximizing the value function to reduce computation time. The learning rule was verified by our experiment.",
keywords = "Discrete optimization problem, Mobile robot, Motion planning, Navigation, Reinforcement learning",
author = "Harukazu Igarashi",
year = "2001",
doi = "10.1527/tjsai.16.501",
language = "English",
volume = "16",
pages = "501--509",
journal = "Transactions of the Japanese Society for Artificial Intelligence",
issn = "1346-0714",
publisher = "Japanese Society for Artificial Intelligence",
number = "6",

}
