CONCEPT BASED QUERY AND DOCUMENT EXPANSION USING HIDDEN MARKOV MODEL
Jiuling Zhang, Zuoda Liu, Beixing Deng, Xing Li
2009
Abstract
Query and document expansion techniques have been widely studied as a means of improving the effectiveness of information retrieval. In this paper, we propose a method for concept-based query and document expansion that employs the hidden Markov model (HMM). WordNet is adopted as the thesaurus that supplies the set of concepts and terms. Expanded query and document candidates are generated from the concepts that the HMM recovers from the original query or document term sequence. Using 50,000 web pages crawled from universities as our test collection and the Lemur Toolkit as our retrieval tool, preliminary experiments on query expansion show that the scores of the top 20 retrieved documents increase by 2.7113 on average. The number of documents scoring higher than a given value also increases significantly.
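As an illustration of the idea summarised in the abstract, the sketch below treats concepts as hidden states and terms as observations, recovers the most likely concept sequence with Viterbi decoding, and then expands the query with the other terms of the recovered concepts. It is a minimal sketch under stated assumptions, not the implementation used in the paper: the abstract does not name the decoding algorithm, so Viterbi is assumed here; the concept labels, vocabulary, and all probabilities are invented toy values; and the names `viterbi` and `expand_query` are hypothetical. In the paper's setting the concepts and terms would come from WordNet rather than from a hand-written table.

```python
import math


def viterbi(terms, concepts, start_p, trans_p, emit_p, floor=1e-9):
    """Return the most likely concept (hidden state) sequence for the terms."""

    def lg(p):
        # Log of a probability, floored so unseen events do not yield log(0).
        return math.log(max(p, floor))

    # best[t][c] = (log-probability of the best path ending in c, previous concept)
    best = [{
        c: (lg(start_p.get(c, 0.0)) + lg(emit_p[c].get(terms[0], 0.0)), None)
        for c in concepts
    }]
    for term in terms[1:]:
        prev = best[-1]
        row = {}
        for c in concepts:
            score, back = max(
                (prev[p][0] + lg(trans_p[p].get(c, 0.0)), p) for p in concepts
            )
            row[c] = (score + lg(emit_p[c].get(term, 0.0)), back)
        best.append(row)

    # Backtrack from the best final concept to the start of the sequence.
    path = [max(concepts, key=lambda c: best[-1][c][0])]
    for row in reversed(best[1:]):
        path.append(row[path[-1]][1])
    path.reverse()
    return path


def expand_query(terms, path, emit_p):
    """Append the terms of each recovered concept that the query lacks."""
    expanded = list(terms)
    for concept in dict.fromkeys(path):  # distinct concepts, order preserved
        for term in emit_p[concept]:
            if term not in expanded:
                expanded.append(term)
    return expanded


# Toy model (assumed values): two concepts sharing the ambiguous term "bank".
concepts = ["finance", "river"]
start_p = {"finance": 0.5, "river": 0.5}
trans_p = {
    "finance": {"finance": 0.8, "river": 0.2},
    "river": {"finance": 0.2, "river": 0.8},
}
emit_p = {
    "finance": {"bank": 0.4, "loan": 0.4, "deposit": 0.2},
    "river": {"bank": 0.4, "shore": 0.4, "water": 0.2},
}

query = ["bank", "loan"]
concept_path = viterbi(query, concepts, start_p, trans_p, emit_p)
expanded_query = expand_query(query, concept_path, emit_p)
print(concept_path, expanded_query)
```

In this toy model the term "bank" is equally likely under both concepts, so it is the neighbouring term "loan" that determines which concept is recovered and therefore which terms are added to the query.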
Paper Citation
in Harvard Style
Zhang J., Liu Z., Deng B. and Li X. (2009). CONCEPT BASED QUERY AND DOCUMENT EXPANSION USING HIDDEN MARKOV MODEL. In Proceedings of the Fifth International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 978-989-8111-81-4, pages 688-691. DOI: 10.5220/0001842506880691
in Bibtex Style
@conference{webist09,
author={Jiuling Zhang and Zuoda Liu and Beixing Deng and Xing Li},
title={CONCEPT BASED QUERY AND DOCUMENT EXPANSION USING HIDDEN MARKOV MODEL},
booktitle={Proceedings of the Fifth International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2009},
pages={688-691},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001842506880691},
isbn={978-989-8111-81-4},
}
in EndNote Style
TY - CONF
JO - Proceedings of the Fifth International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI - CONCEPT BASED QUERY AND DOCUMENT EXPANSION USING HIDDEN MARKOV MODEL
SN - 978-989-8111-81-4
AU - Zhang J.
AU - Liu Z.
AU - Deng B.
AU - Li X.
PY - 2009
SP - 688
EP - 691
DO - 10.5220/0001842506880691