USE OF DOMAIN KNOWLEDGE FOR DIMENSION REDUCTION - Application to Mining of Drug Side Effects

Emmanuel Bresso, Sidahmed Benabderrahmane, Malika Smail-Tabbone, Gino Marchetti, Arnaud Sinan Karaboga, Michel Souchet, Amedeo Napoli, Marie-Dominique Devignes

2011

Abstract

High-dimensional datasets can impair the execution of most data mining programs and yield numerous, complex patterns that are ill-suited to interpretation by domain experts. Dimension reduction is therefore an important research direction, and one in which domain knowledge plays an essential role. We present here a new approach for reducing the dimensions of a dataset by exploiting semantic relationships between the terms of an ontology structured as a rooted directed acyclic graph. Terms are clustered using the recently described IntelliGO similarity measure, and the resulting term clusters are then used as descriptors for data representation. The strategy is applied to a set of drugs associated with their side effects, collected from the SIDER database. The terms describing side effects belong to the MedDRA terminology. Hierarchical clustering of about 1,200 MedDRA terms into an optimal collection of 112 term clusters yields a reduced data representation. Two data mining experiments are then conducted to illustrate the advantage of using this reduced representation.
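
The general idea described in the abstract (cluster terms by a semantic similarity, then describe each drug by the clusters its terms fall into) can be illustrated with a minimal sketch. This is not the authors' implementation: the IntelliGO measure is replaced by a small hand-written similarity table, the stopping threshold is an arbitrary assumption rather than the paper's criterion for an optimal number of clusters, and the drug names, term list, and function names (`toy_sim`, `cluster_terms`, `reduce_representation`) are all hypothetical.

```python
from itertools import combinations

# Hypothetical term-term similarities standing in for IntelliGO scores.
TOY_SIM = {
    frozenset(("nausea", "vomiting")): 0.9,
    frozenset(("rash", "urticaria")): 0.8,
}

def toy_sim(a, b):
    """Placeholder similarity: 1.0 for identical terms, 0.1 when unlisted."""
    if a == b:
        return 1.0
    return TOY_SIM.get(frozenset((a, b)), 0.1)

def cluster_terms(terms, sim, threshold):
    """Average-linkage agglomerative clustering of terms.

    Repeatedly merges the two most similar clusters and stops once the
    best average similarity falls below `threshold` (an assumed cut-off).
    """
    clusters = [[t] for t in terms]

    def avg_sim(a, b):
        return sum(sim(x, y) for x in a for y in b) / (len(a) * len(b))

    while len(clusters) > 1:
        (i, j), best = max(
            (((i, j), avg_sim(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        if best < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]  # i < j, so index i stays valid
        del clusters[j]
    return clusters

def reduce_representation(drug_terms, clusters):
    """Replace each drug's set of terms by the set of cluster indices it hits."""
    term_to_cluster = {t: k for k, c in enumerate(clusters) for t in c}
    return {d: {term_to_cluster[t] for t in ts} for d, ts in drug_terms.items()}

terms = ["nausea", "vomiting", "rash", "urticaria", "headache"]
drug_terms = {  # hypothetical drug/side-effect associations
    "drugA": {"nausea", "vomiting", "headache"},
    "drugB": {"rash", "urticaria"},
    "drugC": {"nausea", "rash"},
}

clusters = cluster_terms(terms, toy_sim, threshold=0.5)
reduced = reduce_representation(drug_terms, clusters)
```

In this toy setting the five terms collapse into three clusters, so each drug is described by at most three descriptors instead of five. The same mapping step is what turns a drug-by-term table into a drug-by-cluster table, whatever similarity measure and cut-off are actually used.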


Paper Citation


in Harvard Style

Bresso E., Benabderrahmane S., Smail-Tabbone M., Marchetti G., Sinan Karaboga A., Souchet M., Napoli A. and Devignes M. (2011). USE OF DOMAIN KNOWLEDGE FOR DIMENSION REDUCTION - Application to Mining of Drug Side Effects. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011) ISBN 978-989-8425-79-9, pages 263-268. DOI: 10.5220/0003662602710276

in Bibtex Style

@conference{kdir11,
author={Emmanuel Bresso and Sidahmed Benabderrahmane and Malika Smail-Tabbone and Gino Marchetti and Arnaud Sinan Karaboga and Michel Souchet and Amedeo Napoli and Marie-Dominique Devignes},
title={USE OF DOMAIN KNOWLEDGE FOR DIMENSION REDUCTION - Application to Mining of Drug Side Effects},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)},
year={2011},
pages={263-268},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003662602710276},
isbn={978-989-8425-79-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)
TI - USE OF DOMAIN KNOWLEDGE FOR DIMENSION REDUCTION - Application to Mining of Drug Side Effects
SN - 978-989-8425-79-9
AU - Bresso E.
AU - Benabderrahmane S.
AU - Smail-Tabbone M.
AU - Marchetti G.
AU - Sinan Karaboga A.
AU - Souchet M.
AU - Napoli A.
AU - Devignes M.
PY - 2011
SP - 263
EP - 268
DO - 10.5220/0003662602710276