POPULATING A DOMAIN ONTOLOGY FROM A WEB BIOGRAPHICAL DICTIONARY OF MUSIC - An Unsupervised Rule-based Method to Handle Brazilian Portuguese Texts

Eduardo Motta, Sean Siqueira, Alexandre Andreatta

2009

Abstract

An increasing amount of information is available on the web, usually expressed as text, that is, as unstructured or semi-structured data. Semantic information is implicit in these texts, since they are mainly intended for human consumption and interpretation. Because unstructured information is not easily handled automatically, an information extraction process has to be used to identify concepts and establish relations among them. The outcome of information extraction can be represented as a domain ontology. Ontologies are an appropriate way to represent structured knowledge bases, enabling sharing, reuse and inference. In this paper, an information extraction process is used to populate a domain ontology. It targets Brazilian Portuguese texts from a biographical dictionary of music, which requires specific tools due to language-specific aspects. An unsupervised rule-based method is proposed. Through this process, latent concepts and relations expressed in natural language can be extracted and represented as an ontology, allowing new uses and visualizations of the content, such as semantic browsing and inference of new knowledge.
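
The general idea of rule-based extraction feeding an ontology can be illustrated with a minimal Python sketch. It is not the authors' method: the two regular-expression rules, the relation names `bornIn` and `hasOccupation`, the function `extract_triples`, and the sample entry about "Fulano de Tal" are all invented here for illustration, and plain (subject, relation, object) tuples stand in for ontology instances.

```python
import re

# Hypothetical hand-written rules (an assumption, not taken from the paper):
# each rule maps a Brazilian Portuguese surface pattern to a relation name.
RULES = [
    # "nasceu em <Place>" -> bornIn; the place runs up to the next , . or ;
    ("bornIn", re.compile(r"[Nn]asceu em ([A-ZÀ-Ý][^,.;]*)")),
    # "foi um/uma <noun>" or "é um/uma <noun>" -> hasOccupation
    ("hasOccupation", re.compile(r"\b(?:é|foi) (?:um|uma) (\w+)")),
]


def extract_triples(subject, text):
    """Apply every rule to the text and return (subject, relation, object) triples."""
    triples = []
    for relation, pattern in RULES:
        for match in pattern.finditer(text):
            triples.append((subject, relation, match.group(1).strip()))
    return triples


# Invented dictionary entry, used only to exercise the rules.
entry = "Fulano de Tal foi um compositor. Nasceu em Recife, em 1900."
for triple in extract_triples("Fulano de Tal", entry):
    print(triple)
```

In a fuller pipeline, each triple would presumably be mapped to an individual and a property assertion in the domain ontology; that mapping is left out of this sketch.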


Paper Citation


in Harvard Style

Motta E., Siqueira S. and Andreatta A. (2009). POPULATING A DOMAIN ONTOLOGY FROM A WEB BIOGRAPHICAL DICTIONARY OF MUSIC - An Unsupervised Rule-based Method to Handle Brazilian Portuguese Texts. In Proceedings of the Fifth International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 978-989-8111-81-4, pages 192-199. DOI: 10.5220/0001842301920199

in Bibtex Style

@conference{webist09,
author={Eduardo Motta and Sean Siqueira and Alexandre Andreatta},
title={POPULATING A DOMAIN ONTOLOGY FROM A WEB BIOGRAPHICAL DICTIONARY OF MUSIC - An Unsupervised Rule-based Method to Handle Brazilian Portuguese Texts},
booktitle={Proceedings of the Fifth International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2009},
pages={192-199},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001842301920199},
isbn={978-989-8111-81-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Fifth International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI - POPULATING A DOMAIN ONTOLOGY FROM A WEB BIOGRAPHICAL DICTIONARY OF MUSIC - An Unsupervised Rule-based Method to Handle Brazilian Portuguese Texts
SN - 978-989-8111-81-4
AU - Motta E.
AU - Siqueira S.
AU - Andreatta A.
PY - 2009
SP - 192
EP - 199
DO - 10.5220/0001842301920199