FINDING PROTEIN FAMILY SIMILARITIES IN REAL TIME THROUGH MULTIPLE 3D AND 2D REPRESENTATIONS, INDEXING AND EXHAUSTIVE SEARCHING

Eric Paquet, Herna Lydia Viktor

2009

Abstract

Research suggests that the complex geometric shapes of amino-acid sequence folds often determine their functions. To help domain experts classify new protein structures and identify the functions of such new discoveries, accurate shape-related algorithms for locating similar protein structures are needed. To this end, we present our Content-based Analysis of Protein Structure for Retrieval and Indexing system, which locates protein families, and identifies similarities between families, based on the 2D and 3D signatures of protein structures. Our approach is novel in that we utilize five different representations within a query-by-prototype approach. These representations let us view a particular protein structure, and the family it belongs to, in terms of (1) the C-α chain, (2) the atomic position, (3) the secondary structure, (4) the residue type, or (5) the residue name. Our experimental results indicate that our method accurately locates protein families when evaluated against the 53,000 entries in the Protein Data Bank, with an exhaustive search completing in a fraction of a second.
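The query-by-prototype idea with an exhaustive search over an index can be illustrated with a minimal sketch. The abstract does not specify how the 2D and 3D signatures are computed or compared, so this sketch assumes each entry is already reduced to a fixed-length numeric signature and that Euclidean distance is the comparison measure. The names `distance`, `query_by_prototype`, and `toy_index`, as well as the identifiers and values in the toy index, are hypothetical and are not taken from the paper.

```python
import heapq
import math


def distance(a, b):
    """Euclidean distance between two equal-length signature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def query_by_prototype(index, prototype_id, k=3):
    """Exhaustively compare the prototype's signature with every other entry
    and return the identifiers of the k closest entries."""
    prototype = index[prototype_id]
    candidates = (
        (distance(prototype, signature), entry_id)
        for entry_id, signature in index.items()
        if entry_id != prototype_id
    )
    return [entry_id for _, entry_id in heapq.nsmallest(k, candidates)]


# Hypothetical toy index: made-up identifiers mapped to made-up signatures.
toy_index = {
    "protA": [0.10, 0.80, 0.30],
    "protB": [0.12, 0.79, 0.31],
    "protC": [0.90, 0.10, 0.50],
    "protD": [0.11, 0.82, 0.28],
}

nearest = query_by_prototype(toy_index, "protA", k=2)
print(nearest)
```

Because the signatures are assumed to be precomputed, each query reduces to one linear pass over the index, which is one plausible way an exhaustive search over tens of thousands of entries could stay fast.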


Paper Citation


in Harvard Style

Paquet E. and Viktor H. L. (2009). FINDING PROTEIN FAMILY SIMILARITIES IN REAL TIME THROUGH MULTIPLE 3D AND 2D REPRESENTATIONS, INDEXING AND EXHAUSTIVE SEARCHING. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2009) ISBN 978-989-674-011-5, pages 127-133. DOI: 10.5220/0002286801270133

in Bibtex Style

@conference{kdir09,
author={Eric Paquet and Herna Lydia Viktor},
title={FINDING PROTEIN FAMILY SIMILARITIES IN REAL TIME THROUGH MULTIPLE 3D AND 2D REPRESENTATIONS, INDEXING AND EXHAUSTIVE SEARCHING},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2009)},
year={2009},
pages={127-133},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002286801270133},
isbn={978-989-674-011-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2009)
TI - FINDING PROTEIN FAMILY SIMILARITIES IN REAL TIME THROUGH MULTIPLE 3D AND 2D REPRESENTATIONS, INDEXING AND EXHAUSTIVE SEARCHING
SN - 978-989-674-011-5
AU - Paquet E.
AU - Viktor H. L.
PY - 2009
SP - 127
EP - 133
DO - 10.5220/0002286801270133