ON THE USE OF CORRESPONDENCE ANALYSIS TO LEARN SEED ONTOLOGIES FROM TEXT

Davide Eynard, Fabio Marfia, Matteo Matteucci

2010

Abstract

In this work we present an approach for generating concept hierarchies, in the form of ontologies, from free text. The approach relies on the statistical model of Correspondence Analysis to analyze term occurrences in text, identify the main concepts the text refers to, and retrieve semantic relationships between them. We present a tool that can apply different methods for generating ontologies from text, namely hierarchy generation from a hierarchical clustering representation, search for Hearst Patterns on the Web, and bootstrapping. Our evaluation shows that the tool's precision in generating hierarchies is around 60% for the best automatic approach and around 90% for the best human-assisted approach.
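To make the Hearst Pattern step mentioned in the abstract more concrete, the following is a minimal sketch of one classic pattern ("X such as Y, Z and W"), which suggests that Y, Z and W are hyponyms of X. It is not the authors' tool: the function name `hearst_such_as` and the regular expression are illustrative assumptions, the sketch runs on a local string rather than on Web search results, and it handles only single-word terms without lemmatization.

```python
import re

# Assumed, simplified form of one Hearst pattern:
#   "<hypernym> such as <hyponym>, <hyponym> and <hyponym>"
# Terms are single words; a real system would use noun-phrase chunking.
SUCH_AS = re.compile(
    r"\b(\w+)\s*,?\s+such\s+as\s+"
    r"(\w+(?:\s*,\s*(?!(?:and|or)\b)\w+)*(?:\s*,?\s+(?:and|or)\s+\w+)?)",
    re.IGNORECASE,
)

# Separators inside the hyponym list: ",", ", and", " and", " or", ...
LIST_SEPARATOR = re.compile(
    r"\s*,\s*(?:(?:and|or)\s+)?|\s+(?:and|or)\s+", re.IGNORECASE
)


def hearst_such_as(text):
    """Return (hyponym, hypernym) pairs found by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(text):
        hypernym = match.group(1).lower()
        for item in LIST_SEPARATOR.split(match.group(2)):
            item = item.strip().lower()
            if item:
                pairs.append((item, hypernym))
    return pairs


if __name__ == "__main__":
    sentence = "The corpus mentions fruits such as apples, pears and plums."
    for hyponym, hypernym in hearst_such_as(sentence):
        print(f"{hyponym} is-a {hypernym}")
```

Each extracted pair is a candidate subclass relation between two concepts. In a setting like the one the abstract describes, such candidates would be one source of evidence among several, alongside the clustering-based and bootstrapping methods.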

Paper Citation


in Harvard Style

Eynard D., Marfia F. and Matteucci M. (2010). ON THE USE OF CORRESPONDENCE ANALYSIS TO LEARN SEED ONTOLOGIES FROM TEXT. In Proceedings of the International Conference on Knowledge Engineering and Ontology Development - Volume 1: KEOD, (IC3K 2010), ISBN 978-989-8425-29-4, pages 430-439. DOI: 10.5220/0003102204300439

in Bibtex Style

@conference{keod10,
author={Davide Eynard and Fabio Marfia and Matteo Matteucci},
title={ON THE USE OF CORRESPONDENCE ANALYSIS TO LEARN SEED ONTOLOGIES FROM TEXT},
booktitle={Proceedings of the International Conference on Knowledge Engineering and Ontology Development - Volume 1: KEOD, (IC3K 2010)},
year={2010},
pages={430-439},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003102204300439},
isbn={978-989-8425-29-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Engineering and Ontology Development - Volume 1: KEOD, (IC3K 2010)
TI - ON THE USE OF CORRESPONDENCE ANALYSIS TO LEARN SEED ONTOLOGIES FROM TEXT
SN - 978-989-8425-29-4
AU - Eynard D.
AU - Marfia F.
AU - Matteucci M.
PY - 2010
SP - 430
EP - 439
DO - 10.5220/0003102204300439