IMPROVING THE PERFORMANCE OF THE RIPPER IN INSURANCE RISK CLASSIFICATION - A Comparitive Study using Feature Selection

Mlungisi Duma, Bhekisipho Twala, Tshilidzi Marwala, Fulufhelo V. Nelwamondo

2011

Abstract

The Ripper algorithm is designed to generate rule sets for large datasets with many features. However, it has been shown that its classification performance suffers in the presence of missing data: the algorithm struggles to classify instances as the quality of the data deteriorates with an increasing proportion of missing values. In this paper, feature selection techniques are used to improve the classification performance of the Ripper algorithm. Two techniques are considered, principal component analysis and evidence automatic relevance determination, and they are compared to see which improves the algorithm the most. Training datasets with completely observable data were used to train the algorithm, and testing datasets with missing values were used to measure accuracy. The results show that principal component analysis is the better feature selection technique for the Ripper: with it, classification performance improves significantly and the algorithm becomes more resilient to escalating missing data.
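The experimental setup described in the abstract (fit on complete training data, evaluate on test data with increasing amounts of missing values) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the synthetic data, the missingness rates, the completely-at-random missingness, and the mean imputation step are all assumptions made for illustration; the abstract does not say how missing values were generated or handled. The Ripper rule learner itself is not implemented here; in a full pipeline it would be trained on the projected training features.

```python
import numpy as np

def fit_pca(X_train, n_components):
    """Fit PCA on fully observed training data; return (mean, components)."""
    mean = X_train.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, Vt[:n_components]

def inject_missing(X, rate, rng):
    """Return a copy of X with roughly `rate` of its entries set to NaN.

    Assumption: values go missing completely at random.
    """
    X_missing = X.astype(float).copy()
    mask = rng.random(X.shape) < rate
    X_missing[mask] = np.nan
    return X_missing

def project(X, mean, components):
    """Fill NaNs with the training means (an assumed imputation), then project."""
    X_filled = np.where(np.isnan(X), mean, X)
    return (X_filled - mean) @ components.T

# Synthetic stand-in data; the paper uses insurance datasets.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 10))
X_test = rng.normal(size=(50, 10))

mean, components = fit_pca(X_train, n_components=3)
for rate in (0.0, 0.1, 0.3, 0.5):
    Z_test = project(inject_missing(X_test, rate, rng), mean, components)
    print(rate, Z_test.shape)
```

The projection is fitted only on the complete training data, so the test-time missingness never influences the selected components; this mirrors the train-on-complete, test-on-missing protocol stated in the abstract.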

Paper Citation


in Harvard Style

Duma M., Twala B., Marwala T. and Nelwamondo F. V. (2011). IMPROVING THE PERFORMANCE OF THE RIPPER IN INSURANCE RISK CLASSIFICATION - A Comparitive Study using Feature Selection. In Proceedings of the 8th International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO, ISBN 978-989-8425-74-4, pages 203-210. DOI: 10.5220/0003531902030210

in Bibtex Style

@conference{icinco11,
author={Mlungisi Duma and Bhekisipho Twala and Tshilidzi Marwala and Fulufhelo V. Nelwamondo},
title={IMPROVING THE PERFORMANCE OF THE RIPPER IN INSURANCE RISK CLASSIFICATION - A Comparitive Study using Feature Selection},
booktitle={Proceedings of the 8th International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO},
year={2011},
pages={203--210},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003531902030210},
isbn={978-989-8425-74-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 8th International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO
TI - IMPROVING THE PERFORMANCE OF THE RIPPER IN INSURANCE RISK CLASSIFICATION - A Comparitive Study using Feature Selection
SN - 978-989-8425-74-4
AU - Duma M.
AU - Twala B.
AU - Marwala T.
AU - Nelwamondo F. V.
PY - 2011
SP - 203
EP - 210
DO - 10.5220/0003531902030210