Using a Random Forest Classifier to Find Nuclear Export Signals in Proteins of Arabidopsis thaliana

Claudia Rubiano, Thomas Merkle, Tim W. Nattkemper

2013

Abstract

This paper presents a new computational strategy, based on a random forest classifier, for predicting nuclear export signals (NESs) in proteins of the model plant Arabidopsis thaliana. NESs are amino acid sequences that enable a protein to interact with a nuclear export receptor and thereby to be exported from the nucleus to the cytoplasm. The proposed classifier uses two kinds of features: the sequence of the NES, expressed as the score obtained from an HMM profile, and the physicochemical properties of the amino acid residues, expressed as amino acid index values. Around 5000 of the Arabidopsis protein sequences were predicted to contain NESs. A small group of these proteins was tested experimentally for the actual presence of an NES. Eleven of the 13 tested proteins interacted with the Arabidopsis receptor Exportin 1 (XPO1a) in yeast two-hybrid assays, which indicates that they contain NESs. This experimental validation of nuclear export activity in a selected group of proteins indicates the potential usefulness of the tool. From a biological perspective, the nuclear export activity observed in these proteins strongly suggests that nucleo-cytoplasmic partitioning could be involved in regulating their functions.
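The two kinds of features named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which amino acid indices were used or how they were summarised, so the Kyte-Doolittle hydropathy scale, the mean/min/max summary, the function names, the example peptide and the example profile score are all assumptions made for illustration. The HMM profile score is taken as a precomputed number rather than calculated here.

```python
# Kyte-Doolittle hydropathy values, used here as ONE example of an amino
# acid index (assumption: the abstract does not name the indices used).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def index_features(peptide, index=KYTE_DOOLITTLE):
    """Summarise a peptide by the mean, minimum and maximum index value.

    Hypothetical helper: the choice of summary statistics is an assumption.
    """
    if not peptide:
        raise ValueError("peptide must not be empty")
    values = [index[aa] for aa in peptide.upper()]
    return [sum(values) / len(values), min(values), max(values)]


def feature_vector(peptide, profile_score):
    """Combine a precomputed HMM profile score with index-derived features.

    profile_score is assumed to come from an external profile HMM search;
    it is passed in as-is and not computed in this sketch.
    """
    return [float(profile_score)] + index_features(peptide)


if __name__ == "__main__":
    candidate = "LALKLAGLDI"  # illustrative peptide, not taken from the paper
    score = 12.3              # made-up profile score, for illustration only
    print(feature_vector(candidate, score))
```

Vectors of this shape, one per candidate sequence and paired with known NES/non-NES labels, are the kind of input a random forest classifier would be trained on.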


Paper Citation


in Harvard Style

Rubiano C., Merkle T. and Nattkemper T. W. (2013). Using a Random Forest Classifier to Find Nuclear Export Signals in Proteins of Arabidopsis thaliana. In Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2013) ISBN 978-989-8565-35-8, pages 98-104. DOI: 10.5220/0004192200980104

in Bibtex Style

@conference{bioinformatics13,
author={Claudia Rubiano and Thomas Merkle and Tim W. Nattkemper},
title={Using a Random Forest Classifier to Find Nuclear Export Signals in Proteins of Arabidopsis thaliana},
booktitle={Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2013)},
year={2013},
pages={98-104},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004192200980104},
isbn={978-989-8565-35-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2013)
TI - Using a Random Forest Classifier to Find Nuclear Export Signals in Proteins of Arabidopsis thaliana
SN - 978-989-8565-35-8
AU - Rubiano C.
AU - Merkle T.
AU - Nattkemper T. W.
PY - 2013
SP - 98
EP - 104
DO - 10.5220/0004192200980104