A CLASS SPECIFIC DIMENSIONALITY REDUCTION FRAMEWORK FOR CLASS IMBALANCE PROBLEM: CPC SMOTE

T. Maruthi Padmaja, Bapi S. Raju, P. Radha Krishna

2010

Abstract

The performance of conventional classification algorithms deteriorates under the class imbalance problem, which occurs when one class of data severely outnumbers the other. High data dimensionality also contributes to this deterioration. Principal Component Analysis (PCA) is a widely used technique for dimensionality reduction, but because it is unsupervised, it does not adequately retain class-discriminative information for classification problems. In unbalanced datasets, minority class samples are rare or costly to obtain. Moreover, the misclassification cost associated with minority class samples is higher than that of non-minority class samples. Capturing and validating labeled samples, particularly minority class samples, in the PCA subspace is therefore an important issue. We propose a class-specific dimensionality reduction and oversampling framework named CPC SMOTE to address this issue. The framework combines class-specific PCA subspaces to retain informative features from both the minority and the majority class, and then oversamples the combined class-specific PCA subspace to compensate for the lack of data. We evaluated the proposed approach on one simulated dataset and five UCI repository datasets. The evaluation shows that the framework is effective when compared to PCA and SMOTE preprocessing methods.
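
The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the class-specific subspaces are combined, so the sketch assumes the top-k principal directions of each class are simply stacked into one projection matrix. The function names (`class_pca_basis`, `smote`, `cpc_smote_sketch`), the parameters `k` and `k_neighbors`, and the toy data are all hypothetical, and the oversampling step is a basic SMOTE-style interpolation between nearest minority neighbours.

```python
import numpy as np

def class_pca_basis(X, k):
    """Top-k principal directions (as rows) of one class, via SVD of centered data."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[:k]

def smote(X_min, n_new, k_neighbors, rng):
    """SMOTE-style oversampling: interpolate between a minority sample
    and one of its k nearest minority neighbours."""
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)                  # a point is not its own neighbour
    nn = np.argsort(d, axis=1)[:, :k_neighbors]  # neighbour indices per sample
    base = rng.integers(0, len(X_min), size=n_new)
    neigh = nn[base, rng.integers(0, k_neighbors, size=n_new)]
    gap = rng.random((n_new, 1))                 # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[neigh] - X_min[base])

def cpc_smote_sketch(X, y, minority_label, k=2, k_neighbors=5, seed=0):
    rng = np.random.default_rng(seed)
    is_min = y == minority_label
    X_min, X_maj = X[is_min], X[~is_min]
    # Assumption: "combining" means stacking the per-class PCA bases.
    W = np.vstack([class_pca_basis(X_min, k), class_pca_basis(X_maj, k)])
    Z = (X - X.mean(axis=0)) @ W.T               # project onto the combined basis
    # Oversample the minority class in the projected space until classes balance.
    n_new = len(X_maj) - len(X_min)
    Z_new = smote(Z[is_min], n_new, min(k_neighbors, len(X_min) - 1), rng)
    Z_bal = np.vstack([Z, Z_new])
    y_bal = np.concatenate([y, np.full(n_new, minority_label)])
    return Z_bal, y_bal

# Toy data (hypothetical): 90 majority and 10 minority samples in 6 dimensions.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (90, 6)), rng.normal(2, 1, (10, 6))])
y = np.array([0] * 90 + [1] * 10)
Z_bal, y_bal = cpc_smote_sketch(X, y, minority_label=1)
print(Z_bal.shape, np.bincount(y_bal))
```

The stacked basis is generally not orthogonal, since directions from the two classes may overlap. A fuller treatment might orthogonalize it or select components by a discriminative criterion, but that choice is outside what the abstract specifies.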


Paper Citation


in Harvard Style

Maruthi Padmaja T., S. Raju B. and Radha Krishna P. (2010). A CLASS SPECIFIC DIMENSIONALITY REDUCTION FRAMEWORK FOR CLASS IMBALANCE PROBLEM: CPC SMOTE. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010), ISBN 978-989-8425-28-7, pages 237-242. DOI: 10.5220/0003092502370242

in Bibtex Style

@conference{kdir10,
author={T. Maruthi Padmaja and Bapi S. Raju and P. Radha Krishna},
title={A CLASS SPECIFIC DIMENSIONALITY REDUCTION FRAMEWORK FOR CLASS IMBALANCE PROBLEM: CPC SMOTE},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)},
year={2010},
pages={237-242},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003092502370242},
isbn={978-989-8425-28-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2010)
TI - A CLASS SPECIFIC DIMENSIONALITY REDUCTION FRAMEWORK FOR CLASS IMBALANCE PROBLEM: CPC SMOTE
SN - 978-989-8425-28-7
AU - Maruthi Padmaja T.
AU - S. Raju B.
AU - Radha Krishna P.
PY - 2010
SP - 237
EP - 242
DO - 10.5220/0003092502370242