A CAUTIOUS APPROACH TO GENERALIZATION IN REINFORCEMENT LEARNING

Raphael Fonteneau, Susan A. Murphy, Louis Wehenkel, Damien Ernst

2010

Abstract

In the context of a deterministic Lipschitz continuous environment over continuous state spaces, finite action spaces, and a finite optimization horizon, we propose an algorithm of polynomial complexity which exploits weak prior knowledge about its environment to compute a sequence of actions from a given sample of trajectories and a given initial state. The proposed Viterbi-like algorithm maximizes a recently proposed lower bound on the return, which depends on the initial state, and to this end uses prior knowledge about the environment provided in the form of upper bounds on its Lipschitz constants. It thereby avoids, in a way that depends on the initial state and on the prior knowledge, those regions of the state space where the sample is too sparse to make safe generalizations. Our experiments show that it can lead to more cautious policies than algorithms combining dynamic programming with function approximators. We also give a condition on the sample sparsity ensuring that, for a given initial state, the proposed algorithm produces an optimal open-loop sequence of actions.
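The abstract does not state the lower bound itself, so the following Python sketch is only an illustration under stated assumptions, not the authors' implementation. It assumes the sample is a list of one-step transitions `(x, u, r, y)`, that the bound for a chain of transitions is the sum of their rewards minus a Lipschitz penalty on the gap between each transition's start state and the previous end state (or the initial state), and that the penalty weight for `n` remaining steps is `L_rho * sum(L_f**i for i in range(n))`. The names `cautious_plan`, `lipschitz_q`, `L_f`, `L_rho`, and `dist` are hypothetical.

```python
def lipschitz_q(L_f, L_rho, n):
    # Assumed penalty weight with n steps remaining:
    # L_rho * (1 + L_f + ... + L_f**(n-1)).
    return L_rho * sum(L_f ** i for i in range(n))


def cautious_plan(x0, sample, T, L_f, L_rho, dist):
    """Viterbi-like search for the chain of T sampled transitions that
    maximizes the assumed lower bound on the return from x0.

    sample: list of (x, u, r, y) tuples (state, action, reward, next state).
    dist:   metric on the state space, dist(a, b) -> float.
    Returns (actions, bound).
    """
    n = len(sample)
    # Step 0: penalize the gap between x0 and each transition's start state.
    LQ = lipschitz_q(L_f, L_rho, T)
    value = [r - LQ * dist(x0, x) for (x, u, r, y) in sample]
    back = []  # back[t-1][l] = best predecessor of transition l at step t
    for t in range(1, T):
        LQ = lipschitz_q(L_f, L_rho, T - t)
        new_value, pointers = [], []
        for (x, u, r, y) in sample:
            # Best predecessor k: high accumulated bound, end state close to x.
            best = max(range(n),
                       key=lambda k: value[k] - LQ * dist(sample[k][3], x))
            new_value.append(value[best] - LQ * dist(sample[best][3], x) + r)
            pointers.append(best)
        value = new_value
        back.append(pointers)
    # Backtrack from the best final transition to recover the chain.
    last = max(range(n), key=lambda l: value[l])
    bound = value[last]
    chain = [last]
    for pointers in reversed(back):
        chain.append(pointers[chain[-1]])
    chain.reverse()
    return [sample[l][1] for l in chain], bound
```

Each of the `T` stages compares every transition with every possible predecessor, so the sketch needs on the order of `T * n**2` distance evaluations for a sample of `n` transitions, which is consistent with the polynomial complexity claimed in the abstract. On a toy one-dimensional sample with a low-reward transition at the initial state and a higher-reward transition far away from it, the penalty term makes the search prefer the nearby transition, which is the cautious behavior the abstract describes.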

Paper Citation


in Harvard Style

Fonteneau R., Murphy S., Wehenkel L. and Ernst D. (2010). A CAUTIOUS APPROACH TO GENERALIZATION IN REINFORCEMENT LEARNING. In Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-674-021-4, pages 64-73. DOI: 10.5220/0002726900640073

in Bibtex Style

@conference{icaart10,
author={Raphael Fonteneau and Susan A. Murphy and Louis Wehenkel and Damien Ernst},
title={A CAUTIOUS APPROACH TO GENERALIZATION IN REINFORCEMENT LEARNING},
booktitle={Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2010},
pages={64-73},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002726900640073},
isbn={978-989-674-021-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - A CAUTIOUS APPROACH TO GENERALIZATION IN REINFORCEMENT LEARNING
SN - 978-989-674-021-4
AU - Fonteneau R.
AU - Murphy S.
AU - Wehenkel L.
AU - Ernst D.
PY - 2010
SP - 64
EP - 73
DO - 10.5220/0002726900640073