TOWARDS A UNIFIED STRATEGY FOR THE PREPROCESSING STEP IN DATA MINING
Camelia Vidrighin Bratu, Rodica Potolea
2009
Abstract
Data-related issues represent the main obstacle in obtaining a high quality data mining process. Existing strategies for preprocessing the available data usually focus on a single aspect, such as incompleteness, dimensionality, or filtering out “harmful” attributes. In this paper we propose a unified methodology for data preprocessing, which considers several aspects at the same time. The novelty of the approach lies in enhancing the data imputation step with information from the feature selection step, and performing both operations jointly, as two phases of the same activity. The methodology performs data imputation only on the attributes that are optimal for the class (from the feature selection point of view). Imputation is performed using machine learning methods. When imputing values for a given attribute, the optimal subset of features for that attribute is considered. The methodology is not restricted to a particular technique, but can be applied using any existing data imputation and feature selection methods.
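The joint procedure described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names, the toy data, the correlation-based filter standing in for "any feature selection method", and the 1-nearest-neighbour predictor standing in for "any machine learning imputation method". The sketch only shows the control flow: select the attributes relevant for the class, then, for each selected attribute with missing values, select its own predictive subset and impute from that subset.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den if den else 0.0


def select_features(rows, target, candidates, k):
    """Assumed filter: keep the k candidates most correlated with `target`.

    Only rows where both values are present contribute to a score.
    """
    scores = {}
    for c in candidates:
        pairs = [(r[c], r[target]) for r in rows
                 if r[c] is not None and r[target] is not None]
        scores[c] = abs(pearson(*zip(*pairs))) if len(pairs) > 1 else 0.0
    return sorted(candidates, key=lambda c: scores[c], reverse=True)[:k]


def impute_attribute(rows, attr, predictors):
    """Assumed imputer: copy `attr` from the nearest complete row,
    measuring distance on `predictors` only (1-nearest-neighbour)."""
    complete = [r for r in rows
                if r[attr] is not None and all(r[p] is not None for p in predictors)]
    for r in rows:
        usable = all(r[p] is not None for p in predictors)
        if r[attr] is None and usable and complete:
            nearest = min(complete,
                          key=lambda d: sum((d[p] - r[p]) ** 2 for p in predictors))
            r[attr] = nearest[attr]


def preprocess(rows, attributes, class_attr, k_class=2, k_attr=1):
    """Phase 1: select attributes for the class.
    Phase 2: impute only those, each from its own selected subset."""
    rows = [dict(r) for r in rows]  # work on a copy, leave the input untouched
    selected = select_features(rows, class_attr, attributes, k_class)
    for attr in selected:
        others = [a for a in attributes if a != attr]
        predictors = select_features(rows, attr, others, k_attr)
        impute_attribute(rows, attr, predictors)
    return selected, rows


# Toy data (hypothetical): None marks a missing value.
data = [
    {"a": 1.0, "b": 2.0, "noise": 5.0, "cls": 0},
    {"a": 2.0, "b": 4.0, "noise": None, "cls": 0},
    {"a": 8.0, "b": 16.0, "noise": 4.0, "cls": 1},
    {"a": 9.0, "b": 18.0, "noise": 2.0, "cls": 1},
    {"a": None, "b": 17.5, "noise": 3.0, "cls": 1},
]

selected, cleaned = preprocess(data, ["a", "b", "noise"], "cls")
print(sorted(selected))
print(cleaned[4]["a"], cleaned[1]["noise"])
```

In this toy run the attribute `noise` is not selected for the class, so its missing value is left as it is, while the missing value of `a` is filled from the row that is closest on `b`, the attribute chosen as the predictor for `a`. Swapping in a wrapper-based selector or a different learner would only change `select_features` and `impute_attribute`; the two-phase loop in `preprocess` stays the same.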
Paper Citation
in Harvard Style
Vidrighin Bratu C. and Potolea R. (2009). TOWARDS A UNIFIED STRATEGY FOR THE PREPROCESSING STEP IN DATA MINING. In Proceedings of the 11th International Conference on Enterprise Information Systems - Volume 2: ICEIS, ISBN 978-989-8111-85-2, pages 230-235. DOI: 10.5220/0002008902300235
in Bibtex Style
@conference{iceis09,
author={Camelia Vidrighin Bratu and Rodica Potolea},
title={TOWARDS A UNIFIED STRATEGY FOR THE PREPROCESSING STEP IN DATA MINING},
booktitle={Proceedings of the 11th International Conference on Enterprise Information Systems - Volume 2: ICEIS},
year={2009},
pages={230-235},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002008902300235},
isbn={978-989-8111-85-2},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 11th International Conference on Enterprise Information Systems - Volume 2: ICEIS
TI - TOWARDS A UNIFIED STRATEGY FOR THE PREPROCESSING STEP IN DATA MINING
SN - 978-989-8111-85-2
AU - Vidrighin Bratu C.
AU - Potolea R.
PY - 2009
SP - 230
EP - 235
DO - 10.5220/0002008902300235