Benchmarking Hadoop Performance in the Cloud - An in Depth Study of Resource Management and Energy Consumption

Aymen Jlassi, Patrick Martineau

2016

Abstract

Virtualization technologies have proven capable of delivering good performance in high performance computing (HPC). Over the last decade, big data tools have emerged with their own performance and infrastructure requirements. With their broad experience in the HPC domain, experts can readily evaluate the infrastructures used to run big data tools. This paper evaluates two virtualization technologies in the context of big data tools: we compare the performance and energy consumption of Docker containers and VMware, and benchmark Hadoop (JoshBaer, 2015) in both environments. First, the aim is to reduce the cost of deploying Hadoop in the cloud. Second, we discuss and analyze the assumptions drawn from HPC experiments and their applicability in the big data context. Third, we provide the Hadoop community with an in-depth study of resource consumption as a function of the deployment environment. We conclude that Docker containers give better performance in most experiments, and that energy consumption varies with the executed workload.


Paper Citation


in Harvard Style

Jlassi A. and Martineau P. (2016). Benchmarking Hadoop Performance in the Cloud - An in Depth Study of Resource Management and Energy Consumption. In Proceedings of the 6th International Conference on Cloud Computing and Services Science - Volume 2: CLOSER, ISBN 978-989-758-182-3, pages 192-201. DOI: 10.5220/0005861701920201

in Bibtex Style

@conference{closer16,
author={Aymen Jlassi and Patrick Martineau},
title={Benchmarking Hadoop Performance in the Cloud - An in Depth Study of Resource Management and Energy Consumption},
booktitle={Proceedings of the 6th International Conference on Cloud Computing and Services Science - Volume 2: CLOSER},
year={2016},
pages={192-201},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005861701920201},
isbn={978-989-758-182-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 6th International Conference on Cloud Computing and Services Science - Volume 2: CLOSER
TI - Benchmarking Hadoop Performance in the Cloud - An in Depth Study of Resource Management and Energy Consumption
SN - 978-989-758-182-3
AU - Jlassi A.
AU - Martineau P.
PY - 2016
SP - 192
EP - 201
DO - 10.5220/0005861701920201