TOWARDS A UNIFIED STRATEGY FOR THE PREPROCESSING STEP IN DATA MINING

Camelia Vidrighin Bratu, Rodica Potolea

2009

Abstract

Data-related issues are the main obstacle to achieving a high-quality data mining process. Existing strategies for preprocessing the available data usually focus on a single aspect, such as incompleteness, dimensionality, or filtering out “harmful” attributes. In this paper we propose a unified methodology for data preprocessing that considers several aspects at the same time. The novelty of the approach lies in enhancing the data imputation step with information from the feature selection step and performing both operations jointly, as two phases of the same activity. The methodology performs data imputation only on the attributes that are optimal for the class (from the feature selection point of view). Imputation is performed using machine learning methods: when imputing values for a given attribute, the optimal feature subset for that attribute is considered. The methodology is not restricted to a particular technique; it can be applied using any existing data imputation and feature selection methods.
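
The two-phase idea described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names (`relevance`, `select_features`, `impute_attribute`, `preprocess`), the OneR-style relevance score used as the feature selection method, the nearest-neighbour majority vote used as the imputation method, the choice to exclude the class attribute from the predictors, and the toy data. Since the methodology is stated to work with any feature selection and imputation methods, these are simple stand-ins only.

```python
from collections import Counter, defaultdict


def relevance(rows, feature, target):
    """OneR-style score (an illustrative stand-in): accuracy of predicting
    `target` from `feature` alone, on rows where both values are known."""
    table = defaultdict(Counter)
    for r in rows:
        if r[feature] is not None and r[target] is not None:
            table[r[feature]][r[target]] += 1
    total = sum(sum(c.values()) for c in table.values())
    if total == 0:
        return 0.0
    hits = sum(c.most_common(1)[0][1] for c in table.values())
    return hits / total


def select_features(rows, target, candidates, k):
    """Filter-style selection: keep the k candidates most relevant to `target`."""
    ranked = sorted(candidates, key=lambda f: relevance(rows, f, target),
                    reverse=True)
    return ranked[:k]


def hamming(r1, r2, predictors):
    """Number of predictors on which two rows differ (missing counts as different)."""
    return sum(r1[p] is None or r2[p] is None or r1[p] != r2[p]
               for p in predictors)


def impute_attribute(rows, attr, predictors, n_neighbors=3):
    """Fill missing values of `attr` in place by majority vote among the
    nearest rows in which `attr` was originally known."""
    known = [r for r in rows if r[attr] is not None]
    for r in rows:
        if r[attr] is None and known:
            nearest = sorted(known, key=lambda o: hamming(r, o, predictors))
            votes = Counter(n[attr] for n in nearest[:n_neighbors])
            r[attr] = votes.most_common(1)[0][0]


def preprocess(rows, class_attr, k_class=2, k_attr=1):
    """Phase 1: select the attributes relevant to the class.
    Phase 2: impute only those attributes, each from its own feature subset."""
    rows = [dict(r) for r in rows]  # work on a copy, leave the input untouched
    attrs = [a for a in rows[0] if a != class_attr]
    selected = select_features(rows, class_attr, attrs, k_class)
    for attr in selected:
        candidates = [a for a in attrs if a != attr]
        predictors = select_features(rows, attr, candidates, k_attr)
        impute_attribute(rows, attr, predictors)
    return rows, selected


# Toy data (hypothetical): None marks a missing value.
rows = [
    {"f1": "lo", "f2": "lo", "noise": "x", "label": "neg"},
    {"f1": "lo", "f2": "lo", "noise": "y", "label": "neg"},
    {"f1": "hi", "f2": "hi", "noise": "x", "label": "pos"},
    {"f1": "hi", "f2": "hi", "noise": "y", "label": "pos"},
    {"f1": None, "f2": "hi", "noise": "x", "label": "pos"},
    {"f1": "lo", "f2": "lo", "noise": None, "label": "neg"},
]
result, selected = preprocess(rows, class_attr="label")
```

In this toy run, `f1` and `f2` are selected for the class, so the missing `f1` value is imputed from `f2` (its most relevant predictor), while the missing `noise` value is left alone because `noise` is not in the selected subset. Swapping in a different ranking criterion or learner would only require replacing `relevance` or `impute_attribute`.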


Paper Citation


in Harvard Style

Vidrighin Bratu C. and Potolea R. (2009). TOWARDS A UNIFIED STRATEGY FOR THE PREPROCESSING STEP IN DATA MINING. In Proceedings of the 11th International Conference on Enterprise Information Systems - Volume 2: ICEIS, ISBN 978-989-8111-85-2, pages 230-235. DOI: 10.5220/0002008902300235

in Bibtex Style

@conference{iceis09,
author={Camelia Vidrighin Bratu and Rodica Potolea},
title={TOWARDS A UNIFIED STRATEGY FOR THE PREPROCESSING STEP IN DATA MINING},
booktitle={Proceedings of the 11th International Conference on Enterprise Information Systems - Volume 2: ICEIS},
year={2009},
pages={230-235},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002008902300235},
isbn={978-989-8111-85-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 11th International Conference on Enterprise Information Systems - Volume 2: ICEIS
TI - TOWARDS A UNIFIED STRATEGY FOR THE PREPROCESSING STEP IN DATA MINING
SN - 978-989-8111-85-2
AU - Vidrighin Bratu C.
AU - Potolea R.
PY - 2009
SP - 230
EP - 235
DO - 10.5220/0002008902300235