AUGMENTING SEARCH WITH CORPUS-DERIVED SEMANTIC RELEVANCE

Zachary Mason

2007

Abstract

This paper describes a system for contextually steered web search. The system is based on a method for estimating the semantic relevance of a web page to a query. Consider a web search for conferences about web search. The query “search conferences” is not effective, because it mostly returns results about searching over conferences rather than conferences on the topic of search. The system described in this paper enables queries of the form “search conference context:pagerank”. The context field in this example specifies a preference for results semantically relevant to the term “pagerank”, although there is no requirement that the results contain the word “pagerank” itself. This is a more semantic, less lexical way of refining the query than adding literal conjuncts. Contextual search, as implemented in this paper, is built on the Google (Google) search engine. For each query, the top one hundred search results are fetched from Google and sorted according to their relevance to the context query. Relevance is computed as a distance function between the vocabulary vectors associated with a web page and a query. For queries, the vocabulary vector is formed by aggregating the web pages in the search results for that query. For web pages, the vocabulary vector is aggregated from that web page and other web pages nearby in link-space.
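The re-ranking step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which distance function or term weighting the paper uses, so the cosine distance over raw term counts, the regex tokenizer, and all function names below (vocabulary_vector, cosine_distance, parse_query, rerank) are assumptions made for illustration. Fetching results from a search engine and collecting link-space neighbors are left out; the sketch takes page texts as plain strings.

```python
import math
import re
from collections import Counter


def vocabulary_vector(texts):
    """Aggregate several page texts into one term-count vector.

    Assumption: raw term counts over lowercase alphanumeric tokens.
    """
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z0-9]+", text.lower()))
    return counts


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is empty.

    Assumption: cosine distance stands in for the paper's
    unspecified distance function.
    """
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def parse_query(raw):
    """Split 'search conference context:pagerank' into (terms, context)."""
    terms, context = [], []
    for token in raw.split():
        if token.startswith("context:"):
            context.append(token[len("context:"):])
        else:
            terms.append(token)
    return " ".join(terms), " ".join(context)


def rerank(results, context_vector):
    """Sort results by ascending distance to the context vector.

    results: list of (url, texts), where texts holds the page's own
    text plus the text of pages near it in link-space.
    """
    return sorted(
        results,
        key=lambda r: cosine_distance(vocabulary_vector(r[1]), context_vector),
    )


if __name__ == "__main__":
    terms, context = parse_query("search conference context:pagerank")
    # Stand-in for the aggregated search results of the context query.
    context_vector = vocabulary_vector(
        ["pagerank link analysis ranking web search engine"]
    )
    # Stand-in for the top results of the main query.
    results = [
        ("a", ["searching over conference proceedings and schedules"]),
        ("b", ["conference on web search ranking and link analysis"]),
    ]
    for url, _ in rerank(results, context_vector):
        print(url)
```

In this toy input, page "b" shares vocabulary with the context vector even though it never contains the word "pagerank", so it sorts ahead of page "a". This is the behavior the context field is meant to provide.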


Paper Citation


in Harvard Style

Mason Z. (2007). AUGMENTING SEARCH WITH CORPUS-DERIVED SEMANTIC RELEVANCE. In Proceedings of the Third International Conference on Web Information Systems and Technologies - Volume 2: WEBIST, ISBN 978-972-8865-78-8, pages 367-371. DOI: 10.5220/0001259403670371

in Bibtex Style

@conference{webist07,
author={Zachary Mason},
title={AUGMENTING SEARCH WITH CORPUS-DERIVED SEMANTIC RELEVANCE},
booktitle={Proceedings of the Third International Conference on Web Information Systems and Technologies - Volume 2: WEBIST},
year={2007},
pages={367-371},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001259403670371},
isbn={978-972-8865-78-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Third International Conference on Web Information Systems and Technologies - Volume 2: WEBIST
TI - AUGMENTING SEARCH WITH CORPUS-DERIVED SEMANTIC RELEVANCE
SN - 978-972-8865-78-8
AU - Mason Z.
PY - 2007
SP - 367
EP - 371
DO - 10.5220/0001259403670371