DNA AND NATURAL LANGUAGES - Text Mining

Gemma Bel-Enguix, Veronica Dahl, M. Dolores Jimenez-lopez

2009

Abstract

We present, discuss and exemplify a fully implemented model of text mining that can be applied to spoken languages as well as to molecular biology languages. It is based on the model presented in (Zahariev et al., 2009), which is oriented to discovering DNA barcodes for sequences. The novelty of our methodology is the use of Constraint Based Reasoning to detect string repetitions through unification, by introducing a new general rule for matching. We claim that the same method can be successfully applied to mining natural language texts.
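To make the idea of treating DNA and natural language text uniformly more concrete, the following is a minimal sketch of repetition detection over a generic token sequence. It is not the constraint-based, unification-driven method of the paper: it swaps in a plain dictionary-based n-gram index, and the function name `repeated_ngrams` and the sample inputs are assumptions chosen for illustration only.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def repeated_ngrams(tokens: Sequence[str], n: int) -> Dict[Tuple[str, ...], List[int]]:
    """Map each n-gram that occurs more than once to its start positions.

    `tokens` may be a DNA string (each character is a token) or a list
    of words, so the same routine covers both kinds of "language".
    Hypothetical helper for illustration, not the paper's algorithm.
    """
    positions: Dict[Tuple[str, ...], List[int]] = defaultdict(list)
    # Slide a window of length n over the sequence and record where
    # each distinct window starts.
    for i in range(len(tokens) - n + 1):
        positions[tuple(tokens[i:i + n])].append(i)
    # Keep only the windows seen at least twice.
    return {gram: starts for gram, starts in positions.items() if len(starts) > 1}


# DNA: characters are the tokens.
dna_repeats = repeated_ngrams("ACGTACGTTT", 4)

# Natural language: words are the tokens.
text_repeats = repeated_ngrams("the cat saw the cat run".split(), 2)
```

Here the only difference between the two calls is how the input is tokenised, which is the sense in which one repetition-mining routine can serve both domains.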

Paper Citation


in Harvard Style

Bel-Enguix G., Dahl V. and Dolores Jimenez-lopez M. (2009). DNA AND NATURAL LANGUAGES - Text Mining. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2009) ISBN 978-989-674-011-5, pages 140-145. DOI: 10.5220/0002292201400145

in Bibtex Style

@conference{kdir09,
author={Gemma Bel-Enguix and Veronica Dahl and M. Dolores Jimenez-lopez},
title={DNA AND NATURAL LANGUAGES - Text Mining},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2009)},
year={2009},
pages={140-145},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002292201400145},
isbn={978-989-674-011-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2009)
TI - DNA AND NATURAL LANGUAGES - Text Mining
SN - 978-989-674-011-5
AU - Bel-Enguix G.
AU - Dahl V.
AU - Dolores Jimenez-lopez M.
PY - 2009
SP - 140
EP - 145
DO - 10.5220/0002292201400145