DEALING WITH “VERY LARGE” DATASETS - An Overview of a Promising Research Line: Distributed Learning

Diego Peteiro-Barral, Bertha Guijarro-Berdiñas, Beatriz Pérez-Sánchez

2011

Abstract

Traditionally, the limited amount of available data was a bottleneck in the development of more intelligent systems. Nowadays, however, in many domains of machine learning the datasets are so large that the limiting factor is the inability of learning algorithms to use all the data within a reasonable time. To handle this problem, a new field in machine learning has emerged: large-scale learning, where learning is limited by computational resources rather than by the availability of data. Moreover, in many real applications, “very large” datasets are naturally distributed, and it is necessary to learn locally at each of the workstations where the data are generated. However, the great majority of well-known learning algorithms do not provide an admissible solution to both problems: learning from “very large” datasets and learning from distributed data. In this context, distributed learning seems to be a promising line of research for dealing with both situations, since “very large” concentrated datasets can also be partitioned among several workstations. This paper provides some background on distributed environments as well as an overview of distributed learning for dealing with “very large” datasets.


Paper Citation


in Harvard Style

Peteiro-Barral D., Guijarro-Berdiñas B. and Pérez-Sánchez B. (2011). DEALING WITH “VERY LARGE” DATASETS - An Overview of a Promising Research Line: Distributed Learning. In Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-8425-40-9, pages 476-481. DOI: 10.5220/0003288804760481

in Bibtex Style

@conference{icaart11,
author={Diego Peteiro-Barral and Bertha Guijarro-Berdiñas and Beatriz Pérez-Sánchez},
title={DEALING WITH “VERY LARGE” DATASETS - An Overview of a Promising Research Line: Distributed Learning},
booktitle={Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2011},
pages={476-481},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003288804760481},
isbn={978-989-8425-40-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - DEALING WITH “VERY LARGE” DATASETS - An Overview of a Promising Research Line: Distributed Learning
SN - 978-989-8425-40-9
AU - Peteiro-Barral D.
AU - Guijarro-Berdiñas B.
AU - Pérez-Sánchez B.
PY - 2011
SP - 476
EP - 481
DO - 10.5220/0003288804760481