USING THE STRUCTURAL CONTENT OF DOCUMENTS TO AUTOMATICALLY GENERATE QUALITY METADATA

Lars Fredrik Høimyr Edvardsen, Ingeborg Torvik Sølvberg, Trond Aalberg, Hallvard Trætteberg

2009

Abstract

Giving search engines access to high-quality document metadata is crucial for efficient document retrieval on the Internet and on corporate intranets. Such metadata is currently sparse. This paper presents how the structural content of document files can be used for Automatic Metadata Generation (AMG), basing generation directly on the documents’ content (code) and enabling effective combinations of AMG algorithms for additional harvesting and extraction. This makes it possible to use AMG to generate metadata that is of high quality in terms of syntax, semantics and pragmatics, from data sources that are non-homogeneous in the visual characteristics and language of their intellectual content.

Paper Citation


in Harvard Style

Edvardsen L., Sølvberg I., Aalberg T. and Trætteberg H. (2009). USING THE STRUCTURAL CONTENT OF DOCUMENTS TO AUTOMATICALLY GENERATE QUALITY METADATA. In Proceedings of the Fifth International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 978-989-8111-81-4, pages 354-363. DOI: 10.5220/0001841003540363

in Bibtex Style

@conference{webist09,
author={Lars Fredrik Høimyr Edvardsen and Ingeborg Torvik Sølvberg and Trond Aalberg and Hallvard Trætteberg},
title={USING THE STRUCTURAL CONTENT OF DOCUMENTS TO AUTOMATICALLY GENERATE QUALITY METADATA},
booktitle={Proceedings of the Fifth International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2009},
pages={354-363},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001841003540363},
isbn={978-989-8111-81-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Fifth International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI - USING THE STRUCTURAL CONTENT OF DOCUMENTS TO AUTOMATICALLY GENERATE QUALITY METADATA
SN - 978-989-8111-81-4
AU - Edvardsen L.
AU - Sølvberg I.
AU - Aalberg T.
AU - Trætteberg H.
PY - 2009
SP - 354
EP - 363
DO - 10.5220/0001841003540363