CRAWLING DEEP WEB CONTENT THROUGH QUERY FORMS

Jun Liu, Zhaohui Wu, Lu Jiang, Qinghua Zheng, Xiao Liu

2009

Abstract

This paper proposes the concept of the Minimum Executable Pattern (MEP) and presents an MEP generation method and an MEP-based adaptive query method for the Deep Web. The query method extends the query interface from a single textbox to a set of MEPs, and generates a locally optimal query by choosing an MEP and a keyword vector for that MEP. The method mitigates, to a certain extent, the “data islands” problem that results from the deficiencies of current methods. Experimental results on six real-world Deep Web sites show that the method outperforms existing methods in query capability and applicability.


Paper Citation


in Harvard Style

Liu J., Wu Z., Jiang L., Zheng Q. and Liu X. (2009). CRAWLING DEEP WEB CONTENT THROUGH QUERY FORMS. In Proceedings of the Fifth International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 978-989-8111-81-4, pages 629-637. DOI: 10.5220/0001830806290637

in Bibtex Style

@conference{webist09,
author={Jun Liu and Zhaohui Wu and Lu Jiang and Qinghua Zheng and Xiao Liu},
title={CRAWLING DEEP WEB CONTENT THROUGH QUERY FORMS},
booktitle={Proceedings of the Fifth International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2009},
pages={629-637},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001830806290637},
isbn={978-989-8111-81-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Fifth International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI - CRAWLING DEEP WEB CONTENT THROUGH QUERY FORMS
SN - 978-989-8111-81-4
AU - Liu J.
AU - Wu Z.
AU - Jiang L.
AU - Zheng Q.
AU - Liu X.
PY - 2009
SP - 629
EP - 637
DO - 10.5220/0001830806290637