A Data-aware MultiWorkflow Scheduler for Clusters on WorkflowSim

César Acevedo, Porfidio Hernández, Antonio Espinosa, Victor Mendez

2017

Abstract

Most scientific workflows are defined as Directed Acyclic Graphs (DAGs). Although DAGs are expressive enough to capture dependency relationships, current approaches are not aware of the characteristics of the storage in terms of performance and capacity. Providing information about temporal storage allocation in data-intensive applications helps to avoid performance issues. Nevertheless, several combinations of data file locations and application scheduling need to be evaluated. Simulation is one of the most popular evaluation methods in scientific workflow execution for developing new storage-aware scheduling techniques or improving existing ones, and for testing scalability and repeatability. This paper presents a multiworkflow storage-aware scheduling policy as an extension of WorkflowSim, which can be combined with other WorkflowSim scheduling policies and used to evaluate a wide range of storage and file allocation possibilities. This paper also presents a proof of concept of a real-world implementation of a storage-aware scheduler to validate the accuracy of the WorkflowSim extension and the scalability of our scheduling technique. The evaluation on several environments shows promising results, with up to 69% makespan improvement on simulated large-scale clusters and an error of the WorkflowSim extension between 0.9% and 3% compared with the real infrastructure implementation.
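
To make the idea of data-aware placement concrete, the following is a minimal sketch of a greedy heuristic that sends each task to the node where it could start soonest, counting the time to copy any input files the node does not already hold. It is an illustration only, not the scheduler described in the paper and not WorkflowSim code: the `Node` and `Task` classes, the `schedule` function, the single shared bandwidth figure, and the assumption that tasks arrive already ordered by their DAG dependencies are all hypothetical simplifications.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    files: set = field(default_factory=set)  # files already on the node's local storage
    ready_at: float = 0.0                    # time (s) at which the node becomes free


@dataclass
class Task:
    name: str
    inputs: dict    # input file name -> size in MB
    runtime: float  # compute time in seconds


def transfer_time(task, node, bandwidth_mb_s):
    """Seconds needed to copy the inputs that are missing on this node."""
    missing_mb = sum(size for f, size in task.inputs.items() if f not in node.files)
    return missing_mb / bandwidth_mb_s


def schedule(tasks, nodes, bandwidth_mb_s=100.0):
    """Greedy placement: pick the node with the earliest possible start time.

    Assumes `tasks` is already in a valid dependency order (hypothetical
    simplification; a real workflow engine resolves the DAG first).
    """
    placement = {}
    for task in tasks:
        best = min(nodes, key=lambda n: n.ready_at + transfer_time(task, n, bandwidth_mb_s))
        finish = best.ready_at + transfer_time(task, best, bandwidth_mb_s) + task.runtime
        best.files.update(task.inputs)  # inputs are now cached on the chosen node
        best.ready_at = finish
        placement[task.name] = best.name
    makespan = max(n.ready_at for n in nodes)
    return placement, makespan
```

With two nodes, one of which already holds a large reference file, the first task that needs that file is placed where the data lives, while a later task with a smaller input may be placed on the idle node if copying is cheaper than waiting. Comparing the resulting makespan against a placement that ignores file locations gives the kind of relative improvement figure the abstract reports.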

Paper Citation


in Harvard Style

Acevedo C., Hernández P., Espinosa A. and Mendez V. (2017). A Data-aware MultiWorkflow Scheduler for Clusters on WorkflowSim. In Proceedings of the 2nd International Conference on Complexity, Future Information Systems and Risk - Volume 1: COMPLEXIS, ISBN 978-989-758-244-8, pages 79-86. DOI: 10.5220/0006303500790086

in Bibtex Style

@conference{complexis17,
author={César Acevedo and Porfidio Hernández and Antonio Espinosa and Victor Mendez},
title={A Data-aware MultiWorkflow Scheduler for Clusters on WorkflowSim},
booktitle={Proceedings of the 2nd International Conference on Complexity, Future Information Systems and Risk - Volume 1: COMPLEXIS},
year={2017},
pages={79-86},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006303500790086},
isbn={978-989-758-244-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 2nd International Conference on Complexity, Future Information Systems and Risk - Volume 1: COMPLEXIS
TI - A Data-aware MultiWorkflow Scheduler for Clusters on WorkflowSim
SN - 978-989-758-244-8
AU - Acevedo C.
AU - Hernández P.
AU - Espinosa A.
AU - Mendez V.
PY - 2017
SP - 79
EP - 86
DO - 10.5220/0006303500790086