CONTINUOUS ACTION REINFORCEMENT LEARNING AUTOMATA - Performance and Convergence

Abdel Rodríguez; Ricardo Grau; Ann Nowé

doi:10.5220/0003287104730478

2011

Abstract

Reinforcement learning (RL) is a powerful technique that lets agents solve unknown Markov decision processes from the possibly delayed signals they receive. Most RL work, in particular for multi-agent settings, assumes a discrete action set. Learning automata are reinforcement learners, belonging to the category of policy iterators, that exhibit good convergence properties in discrete action settings. Unfortunately, most applications involve continuous actions. A formulation of a continuous action reinforcement learning automaton already exists, but it comes with no guarantee of convergence to optimal decisions. This paper proposes an improvement to the performance of the method, together with a proof of its local convergence.

Paper Citation

in Harvard Style

Rodríguez A., Grau R. and Nowé A. (2011). CONTINUOUS ACTION REINFORCEMENT LEARNING AUTOMATA - Performance and Convergence. In Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 2: ICAART, ISBN 978-989-8425-41-6, pages 473-478. DOI: 10.5220/0003287104730478

in Bibtex Style

@conference{icaart11,
author={Abdel Rodríguez and Ricardo Grau and Ann Nowé},
title={CONTINUOUS ACTION REINFORCEMENT LEARNING AUTOMATA - Performance and Convergence},
booktitle={Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2011},
pages={473-478},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003287104730478},
isbn={978-989-8425-41-6},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - CONTINUOUS ACTION REINFORCEMENT LEARNING AUTOMATA - Performance and Convergence
SN - 978-989-8425-41-6
AU - Rodríguez A.
AU - Grau R.
AU - Nowé A.
PY - 2011
SP - 473
EP - 478
DO - 10.5220/0003287104730478