UNREVEALING BIOLOGICAL PROCESS WITH LINEAR ALGEBRA - Extracting Patterns from Noisy Data

Bráulio Roberto Gonçalves Marinho Couto, Marcelo Matos Santoro, Marcos Augusto dos Santos

2011

Abstract

Extracting patterns from protein sequence data is one of the challenges of computational biology. Here we use linear algebra to analyze sequences without requiring multiple alignments. In this study, the singular value decomposition (SVD) of a sparse p-peptide frequency matrix (M) is used to detect and extract signals from noisy protein data (M = USV^T). The central matrix S is diagonal and contains the singular values of M in decreasing order. Here we interpret the biological significance of the SVD: the singular value spectrum, visualized as a scree plot, reveals the main components, that is, the processes hidden in the database. This information can be used in many applications, such as clustering, gene expression analysis, immune response pattern identification, characterization of protein molecular dynamics, and phylogenetic inference. Visualizing the singular value spectrum from the SVD analysis shows how many processes may be hidden in a database and can help biologists detect and extract small signals from noisy data.
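The decomposition described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the peptide length p = 3, the toy sequences, and the function names `peptide_frequency_matrix` and `singular_value_spectrum` are assumptions made for illustration, and a dense NumPy array stands in for the sparse matrix the abstract mentions.

```python
from itertools import product

import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def peptide_frequency_matrix(sequences, p=3):
    """Build M: one row per possible p-peptide, one column per sequence.

    Hypothetical helper; p=3 is an assumed value, not taken from the paper.
    """
    index = {"".join(k): i for i, k in enumerate(product(AMINO_ACIDS, repeat=p))}
    M = np.zeros((len(index), len(sequences)))
    for j, seq in enumerate(sequences):
        # Slide a window of length p along the sequence and count each peptide.
        for start in range(len(seq) - p + 1):
            row = index.get(seq[start:start + p])
            if row is not None:  # skip windows containing non-standard residues
                M[row, j] += 1
    return M


def singular_value_spectrum(M):
    """Return U, s, Vt with M = U @ diag(s) @ Vt and s in decreasing order."""
    return np.linalg.svd(M, full_matrices=False)


# Toy sequences, invented purely for illustration.
sequences = ["MKTAYIAKQR", "MKTAYIAKQK", "GSHMLEDPVD", "GSHMLEDPAD"]
M = peptide_frequency_matrix(sequences, p=3)
U, s, Vt = singular_value_spectrum(M)

# The values in s are what a scree plot would display, largest first.
for rank, value in enumerate(s, start=1):
    print(f"singular value {rank}: {value:.3f}")
```

Plotting `s` against its rank gives the scree plot; a sharp drop after the first few values would suggest that a small number of components account for most of the structure in the data.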


Paper Citation


in Harvard Style

Roberto Gonçalves Marinho Couto B., Matos Santoro M. and Augusto dos Santos M. (2011). UNREVEALING BIOLOGICAL PROCESS WITH LINEAR ALGEBRA - Extracting Patterns from Noisy Data. In Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2011) ISBN 978-989-8425-36-2, pages 313-317. DOI: 10.5220/0003164103130317

in Bibtex Style

@conference{bioinformatics11,
author={Bráulio Roberto Gonçalves Marinho Couto and Marcelo Matos Santoro and Marcos Augusto dos Santos},
title={UNREVEALING BIOLOGICAL PROCESS WITH LINEAR ALGEBRA - Extracting Patterns from Noisy Data},
booktitle={Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2011)},
year={2011},
pages={313--317},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003164103130317},
isbn={978-989-8425-36-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2011)
TI - UNREVEALING BIOLOGICAL PROCESS WITH LINEAR ALGEBRA - Extracting Patterns from Noisy Data
SN - 978-989-8425-36-2
AU - Roberto Gonçalves Marinho Couto B.
AU - Matos Santoro M.
AU - Augusto dos Santos M.
PY - 2011
SP - 313
EP - 317
DO - 10.5220/0003164103130317