OPTIMAL SAMPLE SELECTION FOR BATCH-MODE REINFORCEMENT LEARNING
Emmanuel Rachelson, François Schnitzler, Louis Wehenkel, Damien Ernst
2011
Abstract
We introduce the Optimal Sample Selection (OSS) meta-algorithm for solving discrete-time Optimal Control problems. This meta-algorithm maps the problem of finding a near-optimal closed-loop policy to the identification of a small set of one-step system transitions that leads to high-quality policies when used as input to a batch-mode Reinforcement Learning (RL) algorithm. We detail a particular instance of this OSS meta-algorithm that uses tree-based Fitted Q-Iteration as the batch-mode RL algorithm and Cross Entropy search as the method for navigating efficiently in the space of sample sets. The results show that this instance of OSS rapidly identifies small sample sets that lead to high-quality policies.
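The loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that do not come from the paper: the toy one-dimensional dynamics in `step`, all function names and hyperparameters, and a 1-nearest-neighbour regressor swapped in for tree-based regression so that the example needs only the Python standard library. Cross Entropy search proposes sets of (state, action) pairs, the system supplies the reward and next state of each pair, Fitted Q-Iteration turns each sample set into a policy, and the return of that policy scores the set.

```python
import random

GAMMA, ACTIONS, STEP = 0.9, (-1, 1), 0.1  # hypothetical toy-problem constants

def step(x, u):
    """Assumed toy dynamics on [0, 1]: reward 1 when the right end is reached."""
    nx = min(1.0, max(0.0, x + u * STEP))
    return (1.0 if nx >= 1.0 else 0.0), nx

def fitted_q_iteration(samples, n_iter=30):
    """FQI over (x, u, r, x') tuples with a 1-NN regressor (stand-in for trees)."""
    q_vals = [0.0] * len(samples)

    def q(x, u):
        # Predict with the closest stored sample that used action u.
        best = None
        for (sx, su, _, _), v in zip(samples, q_vals):
            if su == u and (best is None or abs(sx - x) < best[0]):
                best = (abs(sx - x), v)
        return best[1] if best else 0.0

    for _ in range(n_iter):
        # Bellman backup on every sample, computed from the previous iterate.
        q_vals[:] = [r + GAMMA * max(q(nx, a) for a in ACTIONS)
                     for (_, _, r, nx) in samples]
    return q

def policy_return(q, starts=(0.0, 0.25, 0.5, 0.75), horizon=30):
    """Average discounted return of the greedy policy from a few start states."""
    total = 0.0
    for x in starts:
        for t in range(horizon):
            u = max(ACTIONS, key=lambda a: q(x, a))
            r, x = step(x, u)
            total += GAMMA ** t * r
    return total / len(starts)

def oss_cross_entropy(n_samples=6, pop=30, n_elite=6, n_rounds=10, seed=0):
    """Cross Entropy search over sample sets: Gaussian states, Bernoulli actions."""
    rng = random.Random(seed)
    mu, sigma = [0.5] * n_samples, [0.3] * n_samples
    p_right = [0.5] * n_samples
    best_score, best_set = float("-inf"), None
    for _ in range(n_rounds):
        scored = []
        for _ in range(pop):
            xs = [min(1.0, max(0.0, rng.gauss(m, s))) for m, s in zip(mu, sigma)]
            us = [1 if rng.random() < p else -1 for p in p_right]
            samples = [(x, u) + step(x, u) for x, u in zip(xs, us)]
            score = policy_return(fitted_q_iteration(samples))
            scored.append((score, xs, us, samples))
        scored.sort(key=lambda item: item[0], reverse=True)
        elite = scored[:n_elite]
        if elite[0][0] > best_score:
            best_score, best_set = elite[0][0], elite[0][3]
        for i in range(n_samples):
            # Refit the sampling distribution of slot i to the elite sets.
            ex = [e[1][i] for e in elite]
            mu[i] = sum(ex) / n_elite
            var = sum((v - mu[i]) ** 2 for v in ex) / n_elite
            sigma[i] = max(0.02, var ** 0.5)
            frac = sum(e[2][i] == 1 for e in elite) / n_elite
            p_right[i] = min(0.95, max(0.05, frac))
    return best_score, best_set

if __name__ == "__main__":
    score, sample_set = oss_cross_entropy()
    print("best return:", score)
    for transition in sample_set:
        print(transition)
```

In this sketch the score of a sample set is the simulated return of the policy it induces, so the search needs access to the system or a model of it. The floors on `sigma` and the clipping of `p_right` are an illustrative choice that keeps the sampling distribution from collapsing too early.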
Paper Citation
in Harvard Style
Rachelson E., Schnitzler F., Wehenkel L. and Ernst D. (2011). OPTIMAL SAMPLE SELECTION FOR BATCH-MODE REINFORCEMENT LEARNING. In Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-8425-40-9, pages 41-50. DOI: 10.5220/0003133500410050
in Bibtex Style
@conference{icaart11,
author={Emmanuel Rachelson and François Schnitzler and Louis Wehenkel and Damien Ernst},
title={OPTIMAL SAMPLE SELECTION FOR BATCH-MODE REINFORCEMENT LEARNING},
booktitle={Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2011},
pages={41-50},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003133500410050},
isbn={978-989-8425-40-9},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - OPTIMAL SAMPLE SELECTION FOR BATCH-MODE REINFORCEMENT LEARNING
SN - 978-989-8425-40-9
AU - Rachelson E.
AU - Schnitzler F.
AU - Wehenkel L.
AU - Ernst D.
PY - 2011
SP - 41
EP - 50
DO - 10.5220/0003133500410050