Towards an Approach to Select Features from Low Quality Datasets

José Manuel Cadenas, María del Carmen Garrido, Raquel Martínez

2012

Abstract

Feature selection is an active research area in machine learning. Its main idea is to choose a subset of the available features by eliminating features with little or no predictive information, as well as features that are strongly correlated with one another. There are many approaches to feature selection, but most of them can only work with crisp data. To our knowledge, few approaches can work directly with both crisp and low quality (imprecise and uncertain) data. We therefore propose a new feature selection method that can handle both crisp and low quality data. The proposed approach integrates filter and wrapper methods into a sequential search procedure, with improved classification accuracy for the selected features. The approach consists of the following steps: (1) scaling and discretization of the feature set, and feature pre-selection using the discretization process (filter); (2) ranking of the pre-selected features using a Fuzzy Random Forest ensemble; (3) wrapper feature selection using a Fuzzy Decision Tree technique based on cross-validation. The efficiency and effectiveness of the approach are demonstrated through several experiments with low quality datasets. The approach shows excellent performance, not only in classification accuracy but also with respect to the number of features selected.
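
The three steps listed in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and it handles crisp numeric data only. Each fuzzy component is replaced by a simple stand-in, which is an assumption made purely for illustration: equal-width binning instead of the paper's discretization process, information gain instead of the Fuzzy Random Forest ranking, and a majority-vote lookup table instead of the Fuzzy Decision Tree. All function names (`discretize`, `info_gain`, `cv_accuracy`, `select_features`) and the default parameters are hypothetical.

```python
import math
from collections import Counter, defaultdict

def discretize(column, bins=3):
    """Min-max scale a numeric column to [0, 1], then cut it into equal-width bins."""
    lo, hi = min(column), max(column)
    if hi == lo:  # constant feature: everything falls into a single bin
        return [0] * len(column)
    return [min(int((v - lo) / (hi - lo) * bins), bins - 1) for v in column]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def info_gain(feature, labels):
    """Reduction in label entropy obtained by splitting on a discretized feature."""
    groups = defaultdict(list)
    for f, y in zip(feature, labels):
        groups[f].append(y)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def cv_accuracy(rows, labels, feats, folds=3):
    """k-fold accuracy of a majority-vote lookup table built on the chosen features."""
    correct = 0
    for k in range(folds):
        table = defaultdict(Counter)  # feature-value tuple -> class counts
        seen = Counter()              # class counts over the training part
        for i, (r, y) in enumerate(zip(rows, labels)):
            if i % folds != k:
                table[tuple(r[f] for f in feats)][y] += 1
                seen[y] += 1
        default = seen.most_common(1)[0][0]  # fallback for unseen value tuples
        for i, (r, y) in enumerate(zip(rows, labels)):
            if i % folds == k:
                votes = table.get(tuple(r[f] for f in feats))
                pred = votes.most_common(1)[0][0] if votes else default
                correct += pred == y
    return correct / len(rows)

def select_features(data, labels, bins=3, folds=3):
    """Filter, rank, then wrap: returns (selected feature indices, CV accuracy)."""
    columns = [discretize(list(col), bins) for col in zip(*data)]
    # Step 1 (filter): drop features that the discretization leaves in one bin.
    kept = [j for j, col in enumerate(columns) if len(set(col)) > 1]
    # Step 2 (ranking): order the pre-selected features by information gain.
    ranked = sorted(kept, key=lambda j: info_gain(columns[j], labels), reverse=True)
    # Step 3 (wrapper): add features in ranked order while CV accuracy improves.
    rows = list(zip(*columns))
    selected, best = [], 0.0
    for j in ranked:
        score = cv_accuracy(rows, labels, selected + [j], folds)
        if score > best:
            selected, best = selected + [j], score
    return selected, best
```

Calling `select_features(data, labels)` on a list of numeric rows and a parallel list of class labels returns the chosen column indices and their cross-validated accuracy. The wrapper step only accepts a feature when it strictly improves accuracy, which is one simple way to keep the selected subset small.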


Paper Citation


in Harvard Style

Cadenas J. M., Garrido M. C. and Martínez R. (2012). Towards an Approach to Select Features from Low Quality Datasets. In Proceedings of the 4th International Joint Conference on Computational Intelligence - Volume 1: FCTA, (IJCCI 2012) ISBN 978-989-8565-33-4, pages 357-366. DOI: 10.5220/0004153503570366

in Bibtex Style

@conference{fcta12,
author={José Manuel Cadenas and María del Carmen Garrido and Raquel Martínez},
title={Towards an Approach to Select Features from Low Quality Datasets},
booktitle={Proceedings of the 4th International Joint Conference on Computational Intelligence - Volume 1: FCTA, (IJCCI 2012)},
year={2012},
pages={357-366},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004153503570366},
isbn={978-989-8565-33-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 4th International Joint Conference on Computational Intelligence - Volume 1: FCTA, (IJCCI 2012)
TI - Towards an Approach to Select Features from Low Quality Datasets
SN - 978-989-8565-33-4
AU - Cadenas, José Manuel
AU - Garrido, María del Carmen
AU - Martínez, Raquel
PY - 2012
SP - 357
EP - 366
DO - 10.5220/0004153503570366