Kernel Hierarchical Agglomerative Clustering - Comparison of Different Gap Statistics to Estimate the Number of Clusters

Na Li, Nicolas Lefebvre, Régis Lengellé

2014

Abstract

Clustering algorithms, as unsupervised analysis tools, are useful for exploring data structure and have enjoyed great success in many disciplines. For most clustering algorithms, such as k-means, determining the number of clusters is a crucial step and one of the most difficult problems. Hierarchical Agglomerative Clustering (HAC) has the advantage of representing the data as a dendrogram, which allows clusters to be obtained by cutting the dendrogram at some optimal level. In recent years, within the context of HAC, efficient statistics have been proposed to estimate the number of clusters, and the Gap Statistic of Tibshirani has shown interesting performance. In this paper, we propose some new Gap Statistics to further improve the determination of the number of clusters. Our work focuses on the kernelized version of the widely used Hierarchical Clustering Algorithm.

Paper Citation


in Harvard Style

Li N., Lefebvre N. and Lengellé R. (2014). Kernel Hierarchical Agglomerative Clustering - Comparison of Different Gap Statistics to Estimate the Number of Clusters. In Proceedings of the 3rd International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-758-018-5, pages 255-262. DOI: 10.5220/0004828202550262

in Bibtex Style

@conference{icpram14,
author={Na Li and Nicolas Lefebvre and Régis Lengellé},
title={Kernel Hierarchical Agglomerative Clustering - Comparison of Different Gap Statistics to Estimate the Number of Clusters},
booktitle={Proceedings of the 3rd International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2014},
pages={255-262},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004828202550262},
isbn={978-989-758-018-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 3rd International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - Kernel Hierarchical Agglomerative Clustering - Comparison of Different Gap Statistics to Estimate the Number of Clusters
SN - 978-989-758-018-5
AU - Li N.
AU - Lefebvre N.
AU - Lengellé R.
PY - 2014
SP - 255
EP - 262
DO - 10.5220/0004828202550262