Context-aware MapReduce for Geo-distributed Big Data

Marco Cavallo, Giuseppe Di Modica, Carmelo Polito, Orazio Tomarchio

2015

Abstract

MapReduce is an effective distributed programming model used in cloud computing for large-scale data analysis applications. Hadoop, the best-known and most widely used open-source implementation of the MapReduce model, assumes that every node in a cluster has the same computing capacity and that data are local to tasks. However, in many real big data applications, where data may be located in many datacenters distributed across the planet, these assumptions no longer hold, and Hadoop performance suffers. This paper addresses this problem by proposing a hierarchical MapReduce programming model in which a top-level scheduling system is aware of the heterogeneity of the underlying computing contexts. The main idea of the approach is to improve job processing time by partitioning and redistributing the workload among geo-distributed workers, guided by monitoring of the bottom-level computing and networking context.
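The partitioning idea can be illustrated with a minimal sketch. It is not the scheduler described in the paper: the `Site` fields, the `min(compute, link)` rate model, and the proportional split are all assumptions chosen for illustration, and the site names and figures are made up. The sketch gives each site a share of the input proportional to the rate at which it can both receive and process data, so that all sites would finish at roughly the same time.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical monitored context of one datacenter (illustrative only)."""
    name: str
    compute_mb_s: float  # assumed processing throughput, MB/s
    link_mb_s: float     # assumed inbound network bandwidth, MB/s


def effective_rate(site: Site) -> float:
    # Assumption: a site cannot process data faster than it receives it,
    # so the slower of the two rates bounds its throughput.
    return min(site.compute_mb_s, site.link_mb_s)


def partition_workload(total_mb: float, sites: list[Site]) -> dict[str, float]:
    """Split total_mb among sites in proportion to their effective rates."""
    rates = {s.name: effective_rate(s) for s in sites}
    total_rate = sum(rates.values())
    if total_rate <= 0:
        raise ValueError("no site has usable capacity")
    return {name: total_mb * rate / total_rate for name, rate in rates.items()}


# Made-up example: three heterogeneous sites sharing 1000 MB of input.
sites = [
    Site("dc-eu", compute_mb_s=100.0, link_mb_s=50.0),    # network-bound
    Site("dc-us", compute_mb_s=30.0, link_mb_s=200.0),    # compute-bound
    Site("dc-asia", compute_mb_s=20.0, link_mb_s=20.0),
]
shares = partition_workload(1000.0, sites)
print(shares)
```

With these made-up figures the effective rates are 50, 30, and 20 MB/s, so the sites receive 500, 300, and 200 MB respectively. A homogeneous split of one third each would leave the slowest site as the bottleneck.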

Paper Citation


in Harvard Style

Cavallo M., Di Modica G., Polito C. and Tomarchio O. (2015). Context-aware MapReduce for Geo-distributed Big Data. In Proceedings of the 5th International Conference on Cloud Computing and Services Science - Volume 1: CLOSER, ISBN 978-989-758-104-5, pages 414-421. DOI: 10.5220/0005497704140421

in Bibtex Style

@conference{closer15,
author={Marco Cavallo and Giuseppe Di Modica and Carmelo Polito and Orazio Tomarchio},
title={Context-aware MapReduce for Geo-distributed Big Data},
booktitle={Proceedings of the 5th International Conference on Cloud Computing and Services Science - Volume 1: CLOSER},
year={2015},
pages={414-421},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005497704140421},
isbn={978-989-758-104-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 5th International Conference on Cloud Computing and Services Science - Volume 1: CLOSER
TI - Context-aware MapReduce for Geo-distributed Big Data
SN - 978-989-758-104-5
AU - Cavallo M.
AU - Di Modica G.
AU - Polito C.
AU - Tomarchio O.
PY - 2015
SP - 414
EP - 421
DO - 10.5220/0005497704140421