Impact on Bayesian Networks Classifiers When Learning from Imbalanced Datasets

M. Julia Flores, José A. Gámez

2015

Abstract

In this paper we present a study of the behaviour of some representative Bayesian Networks Classifiers (BNCs) when the dataset they are learned from is imbalanced, that is, when there are far fewer cases labelled with one class value than with the other (assuming binary classification problems). This is a typical source of trouble in some datasets, and the development of more robust techniques is currently very important. In this study, we have selected a benchmark of 129 imbalanced datasets and analysed them with a focus on BNCs. Our results show good performance of these classifiers, which outperform decision trees (C4.5). Finally, an algorithm to improve the performance of any BNC is also given. We have carried out experiments that use oversampling of the minority class to achieve a desired value of the imbalance ratio (IR), which is the number of cases of the majority class divided by the number of cases of the minority class. From this work we can conclude that BNCs show very good performance on imbalanced datasets, and that our proposal enhances their results for those datasets on which they performed poorly.
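The imbalance ratio and the oversampling idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how minority cases are oversampled, so the random duplication used here is an assumption rather than the authors' algorithm, and the names `imbalance_ratio` and `oversample_to_ir` are hypothetical.

```python
import math
import random
from collections import Counter


def imbalance_ratio(labels):
    """IR = (# majority-class cases) / (# minority-class cases), binary only."""
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two class values")
    return max(counts.values()) / min(counts.values())


def oversample_to_ir(rows, labels, target_ir, seed=0):
    """Duplicate random minority cases until IR <= target_ir.

    Assumption: plain random duplication with replacement; the paper's
    actual oversampling procedure is not described in the abstract.
    """
    if target_ir < 1:
        raise ValueError("target_ir must be >= 1")
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two class values")
    (_, n_maj), (minority, n_min) = counts.most_common(2)

    # Smallest minority count that brings the ratio down to the target.
    needed = math.ceil(n_maj / target_ir)
    extra = max(0, needed - n_min)

    rng = random.Random(seed)
    minority_rows = [r for r, y in zip(rows, labels) if y == minority]
    new_rows = list(rows) + [rng.choice(minority_rows) for _ in range(extra)]
    new_labels = list(labels) + [minority] * extra
    return new_rows, new_labels


# Toy data: 90 majority cases and 10 minority cases.
rows = list(range(100))
labels = ["neg"] * 90 + ["pos"] * 10
new_rows, new_labels = oversample_to_ir(rows, labels, target_ir=3)
```

On this toy data the IR starts at 90/10 and the resampled set holds 30 minority cases. In practice only the training split would be resampled, so that evaluation still reflects the original class distribution.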

Paper Citation


in Harvard Style

Flores M. and Gámez J. (2015). Impact on Bayesian Networks Classifiers When Learning from Imbalanced Datasets. In Proceedings of the International Conference on Agents and Artificial Intelligence - Volume 2: ICAART, ISBN 978-989-758-074-1, pages 382-389. DOI: 10.5220/0005201103820389

in Bibtex Style

@conference{icaart15,
author={M. Julia Flores and José A. Gámez},
title={Impact on Bayesian Networks Classifiers When Learning from Imbalanced Datasets},
booktitle={Proceedings of the International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2015},
pages={382-389},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005201103820389},
isbn={978-989-758-074-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - Impact on Bayesian Networks Classifiers When Learning from Imbalanced Datasets
SN - 978-989-758-074-1
AU - Flores M.
AU - Gámez J.
PY - 2015
SP - 382
EP - 389
DO - 10.5220/0005201103820389