SELECTING CATEGORICAL FEATURES IN MODEL-BASED CLUSTERING

Cláudia M. V. Silvestre, Margarida M. G. Cardoso, Mario A. T. Figueiredo

2009

Abstract

There has been relatively little research on feature/variable selection in unsupervised clustering. Feature selection for clustering is a challenging task because there are no class labels to guide the search for relevant features. Most of the methods proposed for this problem focus on numerical data. In this work, we propose an approach to selecting categorical features in clustering. We assume that the data come from a finite mixture of multinomial distributions and implement a new expectation-maximization (EM) algorithm that estimates the parameters of the model and selects the relevant variables. The results obtained on synthetic data illustrate the capability of the proposed approach to select the relevant features.
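
To make the modelling assumption in the abstract concrete, the following is a minimal sketch of plain EM for a finite mixture in which each categorical feature is drawn independently given the component (a multinomial with a single trial per feature). It is not the authors' algorithm: it performs no feature selection, and the function name `em_categorical_mixture`, its parameters, the random initialization and the smoothing constant `eps` are all assumptions made for illustration.

```python
import math
import random


def em_categorical_mixture(data, n_components, n_categories, n_iter=50, seed=0):
    """Plain EM for a mixture of independent categorical features.

    Illustrative sketch only (hypothetical helper, no feature selection).
    data: list of rows, each a list of integer category codes.
    n_categories: number of categories of each feature.
    Returns (weights, theta, resp), where theta[k][j][c] estimates
    P(feature j takes category c | component k).
    """
    rng = random.Random(seed)
    n, d = len(data), len(n_categories)
    eps = 1e-6  # assumed smoothing constant; keeps every probability > 0
    weights = [1.0 / n_components] * n_components
    # Random initial category probabilities, normalized per feature.
    theta = []
    for _ in range(n_components):
        comp = []
        for j in range(d):
            raw = [rng.random() + 0.5 for _ in range(n_categories[j])]
            total = sum(raw)
            comp.append([v / total for v in raw])
        theta.append(comp)
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each row,
        # computed in log space and normalized with the max-shift trick.
        resp = []
        for row in data:
            logp = [
                math.log(weights[k])
                + sum(math.log(theta[k][j][row[j]]) for j in range(d))
                for k in range(n_components)
            ]
            top = max(logp)
            unnorm = [math.exp(v - top) for v in logp]
            total = sum(unnorm)
            resp.append([v / total for v in unnorm])
        # M-step: re-estimate mixing weights and category probabilities
        # from the responsibility-weighted counts.
        for k in range(n_components):
            nk = sum(r[k] for r in resp)
            weights[k] = (nk + eps) / (n + n_components * eps)
            for j in range(d):
                counts = [eps] * n_categories[j]
                for r, row in zip(resp, data):
                    counts[row[j]] += r[k]
                total = sum(counts)
                theta[k][j] = [c / total for c in counts]
    return weights, theta, resp
```

For example, `em_categorical_mixture([[0, 1], [1, 0], [0, 0]], n_components=2, n_categories=[2, 2])` returns the mixing weights, the per-component category probabilities and the per-row responsibilities. In such a sketch, a feature whose estimated category probabilities are nearly identical across components carries little clustering information, which is the kind of feature a selection procedure would aim to discard.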


Paper Citation


in Harvard Style

Silvestre, C. M. V., Cardoso, M. M. G. and Figueiredo, M. A. T. (2009). SELECTING CATEGORICAL FEATURES IN MODEL-BASED CLUSTERING. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2009) ISBN 978-989-674-011-5, pages 303-306. DOI: 10.5220/0002303203030306

in Bibtex Style

@conference{kdir09,
author={Cláudia M. V. Silvestre and Margarida M. G. Cardoso and Mario A. T. Figueiredo},
title={SELECTING CATEGORICAL FEATURES IN MODEL-BASED CLUSTERING},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2009)},
year={2009},
pages={303-306},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002303203030306},
isbn={978-989-674-011-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2009)
TI - SELECTING CATEGORICAL FEATURES IN MODEL-BASED CLUSTERING
SN - 978-989-674-011-5
AU - Silvestre, Cláudia M. V.
AU - Cardoso, Margarida M. G.
AU - Figueiredo, Mario A. T.
PY - 2009
SP - 303
EP - 306
DO - 10.5220/0002303203030306