SIMPLE AND EFFICIENT PROJECTIVE CLUSTERING

Clark F. Olson, Henry J. Lyons

2010

Abstract

We describe a new Monte Carlo algorithm for projective clustering that is both simple and efficient. Like previous Monte Carlo algorithms, we perform trials that sample a small subset of the data points to determine the dimensions in which the points are sufficiently close to form a cluster and then search the rest of the data for data points that are part of the cluster. However, our algorithm differs from previous algorithms in the method by which the dimensions of the cluster are determined and the method for determining the points in the cluster. This allows us to use smaller subsets of the data to determine the cluster dimensions and achieve improved efficiency over previous algorithms. The complexity of our algorithm is O(nd1+loga/logb), where n is the number of data points, d is the number of dimensions in the space, and a and b are parameters that specify which clusters should be found. To our knowledge, this is the lowest published complexity for an algorithm that is able to place high bounds on the probability of success. We present experiments that show that our algorithm outperforms previous algorithms on real and synthetic data.

Download


Paper Citation


in Harvard Style

F. Olson C. and J. Lyons H. (2010). SIMPLE AND EFFICIENT PROJECTIVE CLUSTERING . In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010) ISBN 978-989-8425-28-7, pages 45-55. DOI: 10.5220/0003068400450055

in Bibtex Style

@conference{kdir10,
author={Clark F. Olson and Henry J. Lyons},
title={SIMPLE AND EFFICIENT PROJECTIVE CLUSTERING},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)},
year={2010},
pages={45-55},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003068400450055},
isbn={978-989-8425-28-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)
TI - SIMPLE AND EFFICIENT PROJECTIVE CLUSTERING
SN - 978-989-8425-28-7
AU - F. Olson C.
AU - J. Lyons H.
PY - 2010
SP - 45
EP - 55
DO - 10.5220/0003068400450055