A Semi-supervised Learning Framework to Cluster Mixed Data Types

Artur Abdullin, Olfa Nasraoui

2012

Abstract

We propose a semi-supervised framework to handle diverse data formats or data with mixed-type attributes. Our preliminary results in clustering data with mixed numerical and categorical attributes show that the proposed framework gives better clustering results in the categorical domain. This suggests that the seeds obtained from clustering the numerical domain provide additional knowledge to the categorical clustering algorithm. Additional results show that our approach has the potential to outperform clustering either domain on its own or clustering both domains after converting them to the same target domain.

Paper Citation


in Harvard Style

Abdullin A. and Nasraoui O. (2012). A Semi-supervised Learning Framework to Cluster Mixed Data Types. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2012) ISBN 978-989-8565-29-7, pages 45-54. DOI: 10.5220/0004134300450054

in Bibtex Style

@conference{kdir12,
author={Artur Abdullin and Olfa Nasraoui},
title={A Semi-supervised Learning Framework to Cluster Mixed Data Types},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2012)},
year={2012},
pages={45-54},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004134300450054},
isbn={978-989-8565-29-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2012)
TI - A Semi-supervised Learning Framework to Cluster Mixed Data Types
SN - 978-989-8565-29-7
AU - Abdullin A.
AU - Nasraoui O.
PY - 2012
SP - 45
EP - 54
DO - 10.5220/0004134300450054