Teaching robots behavior patterns by using reinforcement learning: How to raise pet robots with a remote control

Mans Ullerstam, Makoto Mizukawa

Research output: Contribution to conference (Paper)

3 Citations (Scopus)

Abstract

This project set out to show that a system based on reinforcement learning can learn complex behavior patterns. The specific task was to make AIBO, the Sony robot dog, learn such patterns through interaction with humans. A human teaches the reinforcement learning system with a remote control connected to AIBO. To remember learnt behavior sequences, AIBO keeps a short-term memory of its prior actions. This paper demonstrates that behavior sequences, and the relationship between cause and effect, can be learnt in complex environments. It also shows that the system works in a natural environment built on human-AIBO interaction, learning the rewards and the means to reach them in parallel. AIBO can also pick up new behaviors instantly through a method we call 'Instant learning'. The paper presents the methods for implementing such a system.
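The idea of combining reinforcement learning with a short-term memory of prior actions can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes plain tabular Q-learning, a hypothetical three-action set, and a stand-in `remote_control` function that plays the human's role by rewarding "paw" only when it directly follows "sit". All names and parameter values are illustrative, and 'Instant learning' is not covered.

```python
import random
from collections import defaultdict, deque

ACTIONS = ["sit", "paw", "bark"]  # hypothetical action set, not from the paper


class MemoryQLearner:
    """Tabular Q-learning whose state is the last `memory_len` actions."""

    def __init__(self, actions, memory_len=2, alpha=0.5, gamma=0.9,
                 epsilon=0.2, seed=0):
        self.actions = list(actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.memory = deque(maxlen=memory_len)  # short-term memory of actions
        self.q = defaultdict(float)             # (state, action) -> value
        self.rng = random.Random(seed)          # seeded for repeatability

    def state(self):
        return tuple(self.memory)

    def greedy(self, state):
        # Highest-valued action; ties go to the earliest action in the list.
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def choose(self):
        # Epsilon-greedy exploration.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.greedy(self.state())

    def step(self, reward_fn):
        state = self.state()
        action = self.choose()
        reward = reward_fn(state, action)   # feedback from the "remote"
        self.memory.append(action)          # oldest action drops out
        next_state = self.state()
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
        return action, reward


def remote_control(state, action):
    """Stand-in for the human teacher: reward 'paw' only right after 'sit'."""
    if action == "paw" and state and state[-1] == "sit":
        return 1.0
    return 0.0


def train(steps=2000, seed=0):
    agent = MemoryQLearner(ACTIONS, seed=seed)
    for _ in range(steps):
        agent.step(remote_control)
    return agent
```

Because the state is the recent action history, the agent can value "paw" differently depending on what preceded it, which is what lets it learn a sequence rather than a single rewarded action. After `agent = train()`, calling `agent.greedy(("paw", "sit"))` returns the action the agent prefers once it has just sat.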

Original language: English
Pages: 2251-2254
Number of pages: 4
Publication status: Published - 2004 Dec 1
Event: SICE Annual Conference 2004 - Sapporo, Japan
Duration: 2004 Aug 4 to 2004 Aug 6

Conference

Conference: SICE Annual Conference 2004
Country: Japan
City: Sapporo
Period: 04/8/4 to 04/8/6

Keywords

  • AIBO
  • Reinforcement learning
  • Remote control
  • User demonstration

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Computer Science Applications
  • Electrical and Electronic Engineering

Cite this

Ullerstam, M., & Mizukawa, M. (2004). Teaching robots behavior patterns by using reinforcement learning: How to raise pet robots with a remote control. 2251-2254. Paper presented at SICE Annual Conference 2004, Sapporo, Japan.