Improving Toponym Disambiguation by Iteratively Enhancing Certainty of Extraction

Mena B. Habib, Maurice van Keulen

2012

Abstract

Named entity extraction (NEE) and disambiguation (NED) have received much attention in recent years. Typical fields addressing these topics are information retrieval, natural language processing, and the semantic web. This paper addresses two problems with toponym extraction and disambiguation (as a representative example of named entities). First, almost no existing work examines the interdependency between extraction and disambiguation. Second, existing disambiguation techniques mostly take extracted named entities as input without considering the uncertainty and imperfection of the extraction process. The aim of this paper is to investigate both avenues and to show that explicit handling of the uncertainty of annotation has much potential for making both extraction and disambiguation more robust. We conducted experiments with a set of holiday home descriptions with the aim of extracting and disambiguating toponyms. We show that the extraction confidence probabilities are useful in enhancing the effectiveness of disambiguation. Reciprocally, retraining the extraction models with information automatically derived from the disambiguation results improves the extraction models. This mutual reinforcement is shown to still have an effect after several automatic iterations.

Paper Citation


in Harvard Style

Habib M. B. and van Keulen M. (2012). Improving Toponym Disambiguation by Iteratively Enhancing Certainty of Extraction. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: SSTM, (IC3K 2012) ISBN 978-989-8565-29-7, pages 399-410. DOI: 10.5220/0004174903990410

in Bibtex Style

@conference{sstm12,
author={Mena B. Habib and Maurice van Keulen},
title={Improving Toponym Disambiguation by Iteratively Enhancing Certainty of Extraction},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: SSTM, (IC3K 2012)},
year={2012},
pages={399-410},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004174903990410},
isbn={978-989-8565-29-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: SSTM, (IC3K 2012)
TI - Improving Toponym Disambiguation by Iteratively Enhancing Certainty of Extraction
SN - 978-989-8565-29-7
AU - Habib M. B.
AU - van Keulen M.
PY - 2012
SP - 399
EP - 410
DO - 10.5220/0004174903990410