Measuring Cluster Similarity by the Travel Time between Data Points

Yonggang Lu, Xiaoli Hou, Xurong Chen

2014

Abstract

A new similarity measure for hierarchical clustering is proposed. The idea is to treat all the data points as mass points under a hypothetical gravitational force field and to derive the hierarchical clustering results by estimating the travel time between data points. The shorter the time needed to travel from one point to another, the more similar the two data points are. To avoid the complexity of a molecular dynamics simulation, the potential field produced by all the data points is computed, and the travel time between a pair of data points is then estimated from this potential field. The travel time is used to construct a new similarity measure, and an edge-weighted tree of all the data points is built to improve the efficiency of the hierarchical clustering. The proposed method, called Travel-Time based Hierarchical Clustering (TTHC), is evaluated by comparison with four other hierarchical clustering methods. The experiments use two real datasets and two synthetic dataset families comprising 200 randomly generated datasets. The results show that TTHC is very competitive, and that using the estimated travel time instead of the distance between data points can improve the robustness and the quality of clustering.
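
The pipeline outlined in the abstract (potential field, travel-time estimate, edge-weighted tree, hierarchical cut) can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything concrete here is an assumption made for illustration and is not the authors' implementation: unit masses with a 1/r potential, a travel time taken as distance divided by a speed derived from the potential depth at the two endpoints, and a minimum spanning tree (Prim's algorithm) standing in for the edge-weighted tree. The function names `potentials`, `travel_time`, `spanning_tree`, and `cluster` are hypothetical.

```python
import math


def potentials(points, eps=1e-9):
    """Gravity-like potential at each point from all other points.

    Assumption: unit masses and a -1/r potential; eps guards against
    division by zero for coincident points.
    """
    phi = []
    for i, p in enumerate(points):
        total = 0.0
        for j, q in enumerate(points):
            if i != j:
                total -= 1.0 / max(math.dist(p, q), eps)
        phi.append(total)
    return phi


def travel_time(points, phi, i, j):
    """Assumed estimate: distance divided by a potential-derived speed.

    Points in deeper (more negative) potential are treated as faster,
    so pairs inside dense regions get shorter travel times.
    """
    speed = math.sqrt(-(phi[i] + phi[j]))  # phi values are negative
    return math.dist(points[i], points[j]) / speed


def spanning_tree(points, phi):
    """Prim's algorithm over travel times; returns edges (time, i, j)."""
    n = len(points)
    # best[j] = (shortest known travel time to the tree, tree endpoint)
    best = {j: (travel_time(points, phi, 0, j), 0) for j in range(1, n)}
    edges = []
    while best:
        j = min(best, key=lambda k: best[k][0])
        t, i = best.pop(j)
        edges.append((t, i, j))
        for k in best:
            t_new = travel_time(points, phi, j, k)
            if t_new < best[k][0]:
                best[k] = (t_new, j)
    return edges


def cluster(points, k):
    """Keep the n - k fastest tree edges and label the components."""
    phi = potentials(points)
    edges = sorted(spanning_tree(points, phi))
    keep = edges[: len(points) - k]

    parent = list(range(len(points)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for _, i, j in keep:
        parent[find(i)] = find(j)

    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(points))]


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(cluster(pts, 2))
```

Cutting the slowest edges of the tree is equivalent to single-linkage clustering on the assumed travel times; it is used here only because it is the simplest way to turn an edge-weighted tree into a hierarchy.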

Paper Citation


in Harvard Style

Lu Y., Hou X. and Chen X. (2014). Measuring Cluster Similarity by the Travel Time between Data Points. In Proceedings of the 3rd International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-758-018-5, pages 14-20. DOI: 10.5220/0004761800140020

in Bibtex Style

@conference{icpram14,
author={Yonggang Lu and Xiaoli Hou and Xurong Chen},
title={Measuring Cluster Similarity by the Travel Time between Data Points},
booktitle={Proceedings of the 3rd International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2014},
pages={14-20},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004761800140020},
isbn={978-989-758-018-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 3rd International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - Measuring Cluster Similarity by the Travel Time between Data Points
SN - 978-989-758-018-5
AU - Lu Y.
AU - Hou X.
AU - Chen X.
PY - 2014
SP - 14
EP - 20
DO - 10.5220/0004761800140020