Empirical Study of Domain Adaptation with Naïve Bayes on the Task of Splice Site Prediction

Nic Herndon, Doina Caragea

2014

Abstract

For many machine learning problems, training an accurate classifier in a supervised setting requires a substantial volume of labeled data. While large volumes of labeled data are currently available for some of these problems, little or no labeled data exists for others, and manually labeling data can be costly and time consuming. An alternative is to learn classifiers in a domain adaptation setting, in which existing labeled data from a related problem, referred to as the source domain, is leveraged in conjunction with a small amount of labeled data and a large amount of unlabeled data for the problem of interest, or target domain. In this paper, we propose two similar domain adaptation classifiers based on a naïve Bayes algorithm. We evaluate these classifiers on the difficult task of splice site prediction, which is essential for gene prediction. Results show that the algorithms correctly classified instances, with highest average area under the precision-recall curve (auPRC) values between 18.46% and 78.01%.

Paper Citation


in Harvard Style

Herndon N. and Caragea D. (2014). Empirical Study of Domain Adaptation with Naïve Bayes on the Task of Splice Site Prediction. In Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2014) ISBN 978-989-758-012-3, pages 57-67. DOI: 10.5220/0004806800570067

in Bibtex Style

@conference{bioinformatics14,
author={Nic Herndon and Doina Caragea},
title={Empirical Study of Domain Adaptation with Naïve Bayes on the Task of Splice Site Prediction},
booktitle={Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2014)},
year={2014},
pages={57-67},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004806800570067},
isbn={978-989-758-012-3},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2014)
TI - Empirical Study of Domain Adaptation with Naïve Bayes on the Task of Splice Site Prediction
SN - 978-989-758-012-3
AU - Herndon N.
AU - Caragea D.
PY - 2014
SP - 57
EP - 67
DO - 10.5220/0004806800570067