OPTIMAL SAMPLE SELECTION FOR BATCH-MODE REINFORCEMENT LEARNING

Emmanuel Rachelson, François Schnitzler, Louis Wehenkel, Damien Ernst

2011

Abstract

We introduce the Optimal Sample Selection (OSS) meta-algorithm for solving discrete-time Optimal Control problems. This meta-algorithm maps the problem of finding a near-optimal closed-loop policy to the identification of a small set of one-step system transitions that lead to high-quality policies when used as input to a batch-mode Reinforcement Learning (RL) algorithm. We detail a particular instance of this OSS meta-algorithm that uses tree-based Fitted Q-Iteration as the batch-mode RL algorithm and Cross Entropy search as the method for navigating efficiently in the space of sample sets. The results show that this particular instance of OSS rapidly identifies small sample sets that lead to high-quality policies.
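As a rough illustration of the search described in the abstract, the sketch below runs a cross-entropy search over subsets of candidate samples. It is not the authors' implementation: the Bernoulli inclusion probabilities, the function name `cross_entropy_subset_search`, its parameters, and the `toy_score` objective are all assumptions made for this example. In an OSS-like setting, the score would instead come from running a batch-mode RL algorithm (such as Fitted Q-Iteration) on the selected transitions and evaluating the resulting policy.

```python
import random


def cross_entropy_subset_search(n_candidates, score, n_iters=30, pop_size=50,
                                elite_frac=0.2, smoothing=0.7, seed=0):
    """Toy cross-entropy search over subsets of candidate indices.

    Candidate i is included in a sampled subset with probability p[i].
    `score` maps a subset (list of indices) to a number to maximize.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    rng = random.Random(seed)
    p = [0.5] * n_candidates  # initial inclusion probabilities
    best_subset, best_score = [], float("-inf")
    n_elite = max(1, int(elite_frac * pop_size))

    for _ in range(n_iters):
        # Sample a population of subsets from the current distribution.
        population = []
        for _ in range(pop_size):
            subset = [i for i in range(n_candidates) if rng.random() < p[i]]
            population.append((score(subset), subset))

        # Keep the highest-scoring subsets (the elite set).
        population.sort(key=lambda pair: pair[0], reverse=True)
        elites = population[:n_elite]
        if elites[0][0] > best_score:
            best_score, best_subset = elites[0]

        # Move each probability toward its inclusion frequency among elites.
        for i in range(n_candidates):
            freq = sum(1 for _, s in elites if i in s) / n_elite
            p[i] = smoothing * freq + (1 - smoothing) * p[i]

    return best_subset, best_score


# Stand-in objective (assumption): reward a few "useful" samples and
# penalize subset size, mimicking a preference for small sample sets.
USEFUL = {1, 4, 7}


def toy_score(subset):
    return 10 * len(USEFUL.intersection(subset)) - len(subset)


best_subset, best_score = cross_entropy_subset_search(10, toy_score)
```

The size penalty in `toy_score` only stands in for the goal of small sample sets; a real objective would be far more expensive to evaluate, so the population size and iteration count would need to be chosen with that cost in mind.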

Paper Citation


in Harvard Style

Rachelson E., Schnitzler F., Wehenkel L. and Ernst D. (2011). OPTIMAL SAMPLE SELECTION FOR BATCH-MODE REINFORCEMENT LEARNING. In Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-8425-40-9, pages 41-50. DOI: 10.5220/0003133500410050

in Bibtex Style

@conference{icaart11,
author={Emmanuel Rachelson and François Schnitzler and Louis Wehenkel and Damien Ernst},
title={OPTIMAL SAMPLE SELECTION FOR BATCH-MODE REINFORCEMENT LEARNING},
booktitle={Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2011},
pages={41-50},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003133500410050},
isbn={978-989-8425-40-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - OPTIMAL SAMPLE SELECTION FOR BATCH-MODE REINFORCEMENT LEARNING
SN - 978-989-8425-40-9
AU - Rachelson E.
AU - Schnitzler F.
AU - Wehenkel L.
AU - Ernst D.
PY - 2011
SP - 41
EP - 50
DO - 10.5220/0003133500410050