AD-HOC GEOREFERENCING OF WEB-PAGES USING STREET-NAME PREFIX TREES

Andrei Tabarcea, Ville Hautamäki, Pasi Fränti

2010

Abstract

A bottleneck in constructing location-based web searches is that most web-pages do not contain any explicit geocoding such as geotags. An alternative solution can be based on ad-hoc georeferencing, which relies on street addresses, but the problem is how to extract and validate the address strings from free-form text. We propose a rule-based solution that detects address-based locations using a gazetteer and street-name prefix trees created from the gazetteer. We compare this approach against two others: a heuristic method that does not require a gazetteer and instead assumes that a street name has a certain structure, and a method that also uses data structures created from the gazetteer, in the form of street-name arrays. Experiments using our location-based search engine prototype (MOPSI) for Finland and Singapore show that the proposed prefix-tree solution is twice as fast and 10% more accurate than its rule-based alternative, and 10 times faster than when an array structure is used to access the gazetteer.
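
The idea of matching free-form text against a street-name prefix tree can be illustrated with a minimal sketch. This is not the authors' implementation: the names `TrieNode`, `build_trie` and `find_streets`, the case-insensitive matching, the word-boundary rule and the sample gazetteer are all assumptions made for illustration, and the sketch only detects street names, not full addresses.

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    """One node of a character-level prefix tree (hypothetical structure)."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        # Original spelling of the street name ending at this node, if any.
        self.street: Optional[str] = None


def build_trie(street_names: List[str]) -> TrieNode:
    """Insert every gazetteer street name into a prefix tree, lowercased."""
    root = TrieNode()
    for name in street_names:
        node = root
        for ch in name.lower():
            node = node.children.setdefault(ch, TrieNode())
        node.street = name
    return root


def find_streets(text: str, root: TrieNode) -> List[Tuple[int, str]]:
    """Return (offset, street name) pairs for the longest match at each word start.

    Offsets refer to the lowercased text. A match must start and end on a
    word boundary (assumed rule), so "Orchards" does not match "Orchard".
    """
    lowered = text.lower()
    matches: List[Tuple[int, str]] = []
    i = 0
    while i < len(lowered):
        # Only begin matching at the start of a word.
        if i > 0 and lowered[i - 1].isalnum():
            i += 1
            continue
        node = root
        best: Optional[Tuple[int, str]] = None
        j = i
        # Walk down the tree as long as the text agrees with some street name.
        while j < len(lowered) and lowered[j] in node.children:
            node = node.children[lowered[j]]
            j += 1
            at_boundary = j == len(lowered) or not lowered[j].isalnum()
            if node.street is not None and at_boundary:
                best = (j, node.street)  # keep the longest match seen so far
        if best is not None:
            matches.append((i, best[1]))
            i = best[0]  # resume scanning after the matched name
        else:
            i += 1
    return matches


# Hypothetical gazetteer entries and page text, for illustration only.
gazetteer = ["Länsikatu", "Orchard", "Orchard Road"]
trie = build_trie(gazetteer)
page_text = "Visit us at Länsikatu 15, Joensuu or Orchard Road 238."
found = find_streets(page_text, trie)
```

Because each candidate position is resolved by a single walk down the tree, the cost per position is bounded by the length of the longest street name rather than by the number of names in the gazetteer, which is the kind of advantage over a plain array lookup that the abstract reports.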


Paper Citation


in Harvard Style

Tabarcea A., Hautamäki V. and Fränti P. (2010). AD-HOC GEOREFERENCING OF WEB-PAGES USING STREET-NAME PREFIX TREES. In Proceedings of the 6th International Conference on Web Information Systems and Technology - Volume 1: WEBIST, ISBN 978-989-674-025-2, pages 237-244. DOI: 10.5220/0002804002370244

in Bibtex Style

@conference{webist10,
author={Andrei Tabarcea and Ville Hautamäki and Pasi Fränti},
title={AD-HOC GEOREFERENCING OF WEB-PAGES USING STREET-NAME PREFIX TREES},
booktitle={Proceedings of the 6th International Conference on Web Information Systems and Technology - Volume 1: WEBIST},
year={2010},
pages={237-244},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002804002370244},
isbn={978-989-674-025-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 6th International Conference on Web Information Systems and Technology - Volume 1: WEBIST
TI - AD-HOC GEOREFERENCING OF WEB-PAGES USING STREET-NAME PREFIX TREES
SN - 978-989-674-025-2
AU - Tabarcea A.
AU - Hautamäki V.
AU - Fränti P.
PY - 2010
SP - 237
EP - 244
DO - 10.5220/0002804002370244