Discovering the Deep Web through XML Schema Extraction

Yasser Saissi, Ahmed Zellou, Ali Idri

2016

Abstract

The part of the web indexed by search engines contains a vast amount of information. However, another part of the web, called the deep web, is accessible only through its associated HTML forms and contains much more information. Integrating deep web content presents many challenges that current deep web access approaches do not fully address. Integrating deep web data requires knowing the schema that describes each deep web source. This paper presents our approach for extracting the XML schema that describes a selected deep web source. The extracted XML schema will be used to integrate the associated deep web source into a mediation system. The principle of our approach is to apply a static analysis and a dynamic analysis to the HTML forms that give access to the selected deep web source. We describe the algorithms of our approach and compare it with other existing approaches.
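As a rough illustration of what a static analysis of an HTML form could look like, the sketch below reads the named fields of a form and emits a small XML schema for them. It is not the authors' algorithm: the names `FormFieldCollector`, `form_to_xsd`, and `record` are hypothetical, every field is assumed to be a string, and the dynamic analysis (submitting queries and inspecting the results) is not covered.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
SKIPPED_INPUT_TYPES = ("submit", "button", "reset", "hidden", "image")


class FormFieldCollector(HTMLParser):
    """Hypothetical helper: collects named form fields and <select> options."""

    def __init__(self):
        super().__init__()
        # field name -> list of allowed values (empty list means free text)
        self.fields = {}
        self._current_select = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name")
        if tag in ("input", "textarea") and name:
            input_type = (attrs.get("type") or "text").lower()
            if input_type not in SKIPPED_INPUT_TYPES:
                self.fields.setdefault(name, [])
        elif tag == "select" and name:
            self._current_select = name
            self.fields.setdefault(name, [])
        elif tag == "option" and self._current_select:
            value = attrs.get("value")
            if value:
                self.fields[self._current_select].append(value)

    def handle_endtag(self, tag):
        if tag == "select":
            self._current_select = None


def form_to_xsd(html, root_name="record"):
    """Build an XML schema string from the fields found in the form markup."""
    collector = FormFieldCollector()
    collector.feed(html)

    ET.register_namespace("xs", XS)
    schema = ET.Element(f"{{{XS}}}schema")
    root = ET.SubElement(schema, f"{{{XS}}}element", name=root_name)
    complex_type = ET.SubElement(root, f"{{{XS}}}complexType")
    sequence = ET.SubElement(complex_type, f"{{{XS}}}sequence")

    for name, values in collector.fields.items():
        element = ET.SubElement(sequence, f"{{{XS}}}element", name=name)
        if values:
            # A select list becomes an enumeration of its option values.
            simple_type = ET.SubElement(element, f"{{{XS}}}simpleType")
            restriction = ET.SubElement(
                simple_type, f"{{{XS}}}restriction", base="xs:string"
            )
            for value in values:
                ET.SubElement(restriction, f"{{{XS}}}enumeration", value=value)
        else:
            # Assumption: free-text fields are typed as plain strings.
            element.set("type", "xs:string")

    return ET.tostring(schema, encoding="unicode")
```

Calling `form_to_xsd` on the markup of a search form returns a schema string with one element per named field. A fuller treatment would also need field labels, data types, and the structure of the result pages, which this sketch ignores.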

Paper Citation


in Harvard Style

Saissi Y., Zellou A. and Idri A. (2016). Discovering the Deep Web through XML Schema Extraction. In Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016) ISBN 978-989-758-203-5, pages 141-149. DOI: 10.5220/0006013901410149

in Bibtex Style

@conference{kdir16,
author={Yasser Saissi and Ahmed Zellou and Ali Idri},
title={Discovering the Deep Web through XML Schema Extraction},
booktitle={Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016)},
year={2016},
pages={141-149},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006013901410149},
isbn={978-989-758-203-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016)
TI - Discovering the Deep Web through XML Schema Extraction
SN - 978-989-758-203-5
AU - Saissi Y.
AU - Zellou A.
AU - Idri A.
PY - 2016
SP - 141
EP - 149
DO - 10.5220/0006013901410149