THE SPANISH WEB IN NUMBERS - Main Features of the Spanish Hidden Web

Manuel Álvarez, Fidel Cacheda, Rafael López-García, Víctor M. Prieto

2011

Abstract

This article presents a study of the web sites under the “.es” domain, focusing on how widely they use technologies that make it harder for crawling systems to traverse the Web. The study centres on HTML scripts and forms, since they are two well-known entry points to the “Hidden Web”. For scripts, it pays special attention to redirection and the dynamic construction of URLs. The article concludes that a crawler should process these technologies in order to obtain most of the documents on the Web.

Paper Citation


in Harvard Style

Álvarez M., Cacheda F., López-García R. and Prieto V. M. (2011). THE SPANISH WEB IN NUMBERS - Main Features of the Spanish Hidden Web. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011) ISBN 978-989-8425-79-9, pages 363-366. DOI: 10.5220/0003626603710374

in Bibtex Style

@conference{kdir11,
author={Manuel Álvarez and Fidel Cacheda and Rafael López-García and Víctor M. Prieto},
title={THE SPANISH WEB IN NUMBERS - Main Features of the Spanish Hidden Web},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)},
year={2011},
pages={363-366},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003626603710374},
isbn={978-989-8425-79-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)
TI - THE SPANISH WEB IN NUMBERS - Main Features of the Spanish Hidden Web
SN - 978-989-8425-79-9
AU - Álvarez M.
AU - Cacheda F.
AU - López-García R.
AU - Prieto V. M.
PY - 2011
SP - 363
EP - 366
DO - 10.5220/0003626603710374