AN OPTIMAL EVALUATION OF GROUPBY-JOIN QUERIES IN DISTRIBUTED ARCHITECTURES

M. Al Hajj Hassan, M. Bamha

2007

Abstract

SQL queries involving join and group-by operations are fairly common in many decision support applications where the size of the input relations is usually very large, so the parallelization of these queries is highly recommended in order to obtain a desirable response time. The most significant drawbacks of the algorithms presented in the literature for treating such queries are that they are very sensitive to data skew and involve expansive communication and Input/Output costs in the evaluation of the join operation. In this paper, we present an algorithm that overcomes these drawbacks because it evaluates the ”GroupBy-Join” query without the need of the direct evaluation of the costly join operation, thus reducing its Input/Output and communication costs. Furthermore, the performance of this algorithm is analyzed using the scalable and portable BSP (Bulk Synchronous Parallel) cost model which predicts a linear speedup even for highly skewed data.

Download


Paper Citation


in Harvard Style

Al Hajj Hassan M. and Bamha M. (2007). AN OPTIMAL EVALUATION OF GROUPBY-JOIN QUERIES IN DISTRIBUTED ARCHITECTURES . In Proceedings of the Third International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 978-972-8865-77-1, pages 246-252. DOI: 10.5220/0001281302460252

in Bibtex Style

@conference{webist07,
author={M. Al Hajj Hassan and M. Bamha},
title={AN OPTIMAL EVALUATION OF GROUPBY-JOIN QUERIES IN DISTRIBUTED ARCHITECTURES},
booktitle={Proceedings of the Third International Conference on Web Information Systems and Technologies - Volume 1: WEBIST,},
year={2007},
pages={246-252},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001281302460252},
isbn={978-972-8865-77-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Third International Conference on Web Information Systems and Technologies - Volume 1: WEBIST,
TI - AN OPTIMAL EVALUATION OF GROUPBY-JOIN QUERIES IN DISTRIBUTED ARCHITECTURES
SN - 978-972-8865-77-1
AU - Al Hajj Hassan M.
AU - Bamha M.
PY - 2007
SP - 246
EP - 252
DO - 10.5220/0001281302460252