Reducing Sample Complexity in Reinforcement Learning by Transferring Transition and Reward Probabilities

Kouta Oguni, Kazuyuki Narisawa, Ayumi Shinohara

2014

Abstract

Most existing reinforcement learning algorithms require many trials before they obtain optimal policies. In this study, we apply transfer learning to reinforcement learning to improve its efficiency. We propose a new algorithm, TR-MAX, based on the R-MAX algorithm. TR-MAX transfers the transition and reward probabilities from a source task to a target task as prior knowledge. We theoretically analyze the sample complexity of TR-MAX. Moreover, we show that in practice TR-MAX performs much better than R-MAX on maze tasks.
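The transfer mechanism described in the abstract can be illustrated with a short sketch. The Python code below is illustrative only, not the paper's implementation: the class name TRMaxModel, the known-ness threshold M_KNOWN, and the reward bound R_MAX are assumptions. It shows an R-MAX-style tabular model whose counts are seeded with transition and reward statistics transferred from a source task, so that state-action pairs in the target task reach the "known" threshold after fewer fresh samples.

import numpy as np

R_MAX = 1.0    # assumed upper bound on reward
M_KNOWN = 5    # assumed visit threshold for a state-action pair to count as known

class TRMaxModel:
    # Minimal sketch: an R-MAX-style tabular model that can be warm-started
    # with transition/reward statistics transferred from a source task.
    def __init__(self, n_states, n_actions, source=None):
        self.n_sa = np.zeros((n_states, n_actions))             # visit counts
        self.n_sas = np.zeros((n_states, n_actions, n_states))  # transition counts
        self.r_sum = np.zeros((n_states, n_actions))            # accumulated rewards
        if source is not None:
            # Transfer step (the TR-MAX idea from the abstract): reuse the
            # source task's statistics as prior knowledge in the target task.
            self.n_sa += source.n_sa
            self.n_sas += source.n_sas
            self.r_sum += source.r_sum

    def update(self, s, a, r, s_next):
        # Record one observed transition (s, a) -> s_next with reward r.
        self.n_sa[s, a] += 1
        self.n_sas[s, a, s_next] += 1
        self.r_sum[s, a] += r

    def estimates(self, s, a):
        # Empirical transition distribution and mean reward once (s, a) is
        # known; otherwise the optimistic R-MAX default (self-loop, R_MAX).
        n = self.n_sa[s, a]
        if n >= M_KNOWN:
            return self.n_sas[s, a] / n, self.r_sum[s, a] / n
        p = np.zeros(self.n_sas.shape[2])
        p[s] = 1.0
        return p, R_MAX

A target-task agent would then plan (e.g., by value iteration) on these estimates exactly as in R-MAX; the warm start from the source model is the only difference, which is intuitively where the reduction in sample complexity comes from.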

Paper Citation


in Harvard Style

Oguni K., Narisawa K. and Shinohara A. (2014). Reducing Sample Complexity in Reinforcement Learning by Transferring Transition and Reward Probabilities. In Proceedings of the 6th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-758-015-4, pages 632-638. DOI: 10.5220/0004915606320638

in Bibtex Style

@conference{icaart14,
author={Kouta Oguni and Kazuyuki Narisawa and Ayumi Shinohara},
title={Reducing Sample Complexity in Reinforcement Learning by Transferring Transition and Reward Probabilities},
booktitle={Proceedings of the 6th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2014},
pages={632-638},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004915606320638},
isbn={978-989-758-015-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 6th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - Reducing Sample Complexity in Reinforcement Learning by Transferring Transition and Reward Probabilities
SN - 978-989-758-015-4
AU - Oguni K.
AU - Narisawa K.
AU - Shinohara A.
PY - 2014
SP - 632
EP - 638
DO - 10.5220/0004915606320638
ER -