CONCEPT BASED QUERY AND DOCUMENT EXPANSION USING HIDDEN MARKOV MODEL

Jiuling Zhang, Zuoda Liu, Beixing Deng, Xing Li

2009

Abstract

Query and document expansion techniques have been widely studied as a means of improving the effectiveness of information retrieval. In this paper, we propose a method for concept-based query and document expansion that employs the hidden Markov model (HMM). WordNet is adopted as the thesaurus providing the set of concepts and terms. Expanded query and document candidates are generated from the concepts that the hidden Markov model recovers from the original query or document term sequence. Using 50,000 web pages crawled from university sites as the test collection and the Lemur Toolkit as the retrieval tool, preliminary experiments on query expansion show that the scores of the top 20 retrieved documents increase by 2.7113 on average. The number of documents scoring above a given value also increases significantly.
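The idea of recovering concepts from a term sequence and then expanding from those concepts can be illustrated with a minimal sketch. It treats concepts as hidden states and terms as observations, and uses standard Viterbi decoding; the abstract does not state which decoding procedure or parameter estimates the authors use, so that choice is an assumption here. The concept names, vocabulary, probabilities, and the helper names `viterbi` and `expand` are all hypothetical toy values chosen for illustration, not taken from WordNet or from the paper.

```python
import math

# Toy model (assumed values, not from the paper or WordNet).
CONCEPTS = ["education", "computing"]
START_P = {"education": 0.5, "computing": 0.5}
TRANS_P = {
    "education": {"education": 0.6, "computing": 0.4},
    "computing": {"education": 0.4, "computing": 0.6},
}
EMIT_P = {
    "education": {"university": 0.5, "college": 0.3, "school": 0.2},
    "computing": {"network": 0.5, "internet": 0.3, "web": 0.2},
}


def viterbi(terms, concepts, start_p, trans_p, emit_p, floor=1e-9):
    """Return the most likely concept sequence for the observed terms.

    `floor` is a small probability assigned to terms a concept never emits,
    so that logarithms stay finite.
    """
    # best[t][c] = (log-probability of the best path ending in c, previous concept)
    best = [{
        c: (math.log(start_p[c]) + math.log(emit_p[c].get(terms[0], floor)), None)
        for c in concepts
    }]
    for term in terms[1:]:
        row = {}
        for c in concepts:
            score, prev = max(
                (best[-1][p][0] + math.log(trans_p[p][c]), p) for p in concepts
            )
            row[c] = (score + math.log(emit_p[c].get(term, floor)), prev)
        best.append(row)

    # Backtrack from the best final concept.
    path = [max(concepts, key=lambda c: best[-1][c][0])]
    for row in reversed(best[1:]):
        path.append(row[path[-1]][1])
    path.reverse()
    return path


def expand(terms, path, emit_p, per_concept=1):
    """Append the most probable unseen terms of each recovered concept."""
    expanded = list(terms)
    for concept in dict.fromkeys(path):  # unique concepts, order preserved
        ranked = sorted(emit_p[concept], key=emit_p[concept].get, reverse=True)
        expanded.extend([t for t in ranked if t not in expanded][:per_concept])
    return expanded


query = ["university", "network"]
concept_path = viterbi(query, CONCEPTS, START_P, TRANS_P, EMIT_P)
expanded_query = expand(query, concept_path, EMIT_P)
print(concept_path)
print(expanded_query)
```

In a realistic setting the emission and transition tables would be derived from a thesaurus and a corpus rather than written by hand, and the expanded query would then be submitted to the retrieval engine in place of the original.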


Paper Citation


in Harvard Style

Zhang J., Liu Z., Deng B. and Li X. (2009). CONCEPT BASED QUERY AND DOCUMENT EXPANSION USING HIDDEN MARKOV MODEL. In Proceedings of the Fifth International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 978-989-8111-81-4, pages 688-691. DOI: 10.5220/0001842506880691

in Bibtex Style

@conference{webist09,
author={Jiuling Zhang and Zuoda Liu and Beixing Deng and Xing Li},
title={CONCEPT BASED QUERY AND DOCUMENT EXPANSION USING HIDDEN MARKOV MODEL},
booktitle={Proceedings of the Fifth International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2009},
pages={688-691},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001842506880691},
isbn={978-989-8111-81-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Fifth International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI - CONCEPT BASED QUERY AND DOCUMENT EXPANSION USING HIDDEN MARKOV MODEL
SN - 978-989-8111-81-4
AU - Zhang J.
AU - Liu Z.
AU - Deng B.
AU - Li X.
PY - 2009
SP - 688
EP - 691
DO - 10.5220/0001842506880691