PROTEIN DOMAIN PHYLOGENIES - Information Theory and Evolutionary Dynamics

K. Hamacher

2010

Abstract

The ever-increasing wealth of whole-genome information calls for phylogenies based on entire genomes. Finding a good distance measure, however, is a major challenge, e.g. because of large-scale evolutionary events such as genomic rearrangements or inversions. We introduce here an information-theory-driven measure for the encoded protein domain composition of genomes, as protein domains are key evolutionary entities. The new method thus focuses on selectively advantageous events. As evolving different protein domain compositions is more complex than single point mutations, the method makes longer evolutionary times accessible. To illustrate the new methodology, we extract several phylogenetic trees for some 700 genomes, e.g. the separation of the three kingdoms of life, trees for mammals and Bacillales, and a speculative result for plants (monocotyledons and dicotyledons). The method itself is shown to be robust against incomplete genome sampling. It has a consistent interpretation both in information space (at the sequence/information level) and at the level of stochastic evolutionary dynamics. In contrast to established protocols, it becomes more accurate as more organisms are taken into account. Finally, we show the equivalence to a (simplified) model of evolutionary dynamics of proteomes.
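
The abstract does not spell out the measure itself, so the following is only a minimal sketch of what an information-theoretic distance on protein domain composition could look like. It assumes that each genome is summarized as a list of domain labels and, as a stand-in, uses the Jensen-Shannon divergence between the resulting domain frequency distributions. The choice of divergence, the function names, and the domain labels are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def domain_distribution(domains):
    """Turn a list of domain labels into relative frequencies."""
    counts = Counter(domains)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    p and q map labels to probabilities; labels missing from one
    distribution are treated as having probability zero there.
    The result lies in [0, 1] and is symmetric in p and q.
    """
    jsd = 0.0
    for label in set(p) | set(q):
        p_x = p.get(label, 0.0)
        q_x = q.get(label, 0.0)
        m_x = 0.5 * (p_x + q_x)  # mixture distribution
        if p_x > 0.0:
            jsd += 0.5 * p_x * math.log2(p_x / m_x)
        if q_x > 0.0:
            jsd += 0.5 * q_x * math.log2(q_x / m_x)
    return jsd


# Hypothetical domain compositions (placeholder labels, not real data).
genome_a = ["dom1", "dom1", "dom2", "dom3"]
genome_b = ["dom1", "dom2", "dom2", "dom4"]

dist_a = domain_distribution(genome_a)
dist_b = domain_distribution(genome_b)
distance_ab = jensen_shannon_divergence(dist_a, dist_b)
print(f"distance(genome_a, genome_b) = {distance_ab:.4f}")
```

Computing such a value for every pair of genomes would give a distance matrix that a standard tree-building method (for example neighbor joining) could take as input. Whether this corresponds to the procedure used in the paper is not stated in the abstract.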

Paper Citation


in Harvard Style

Hamacher K. (2010). PROTEIN DOMAIN PHYLOGENIES - Information Theory and Evolutionary Dynamics. In Proceedings of the First International Conference on Bioinformatics - Volume 1: BIOINFORMATICS, (BIOSTEC 2010), ISBN 978-989-674-019-1, pages 114-122. DOI: 10.5220/0002710101140122

in Bibtex Style

@conference{bioinformatics10,
author={K. Hamacher},
title={PROTEIN DOMAIN PHYLOGENIES - Information Theory and Evolutionary Dynamics},
booktitle={Proceedings of the First International Conference on Bioinformatics - Volume 1: BIOINFORMATICS, (BIOSTEC 2010)},
year={2010},
pages={114-122},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002710101140122},
isbn={978-989-674-019-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the First International Conference on Bioinformatics - Volume 1: BIOINFORMATICS, (BIOSTEC 2010)
TI - PROTEIN DOMAIN PHYLOGENIES - Information Theory and Evolutionary Dynamics
SN - 978-989-674-019-1
AU - Hamacher K.
PY - 2010
SP - 114
EP - 122
DO - 10.5220/0002710101140122