Web Forums Change Analysis

Tomasz Kaczmarek, Dawid Grzegorz Węckowski

2013

Abstract

In this paper we present results from an experiment conducted on over 27 900 web pages gathered every 2 hours over 22 days from 16 forums (4256 independent crawls) to investigate how these web pages evolve over time. The results of the experiment became the basis for design choices for a focused incremental crawler that will be specialized for efficient gathering of documents from web forums while maintaining high freshness of the local collection of obtained pages. The data analysis shows that forums differ from generic web portals, and that identifying places in the source navigational structure where new documents occur more often would make it possible to improve the crawler’s performance and the collection freshness.


Paper Citation


in Harvard Style

Kaczmarek T. and Węckowski D. (2013). Web Forums Change Analysis. In Proceedings of the 9th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 978-989-8565-54-9, pages 105-110. DOI: 10.5220/0004373201050110

in Bibtex Style

@conference{webist13,
author={Tomasz Kaczmarek and Dawid Grzegorz Węckowski},
title={Web Forums Change Analysis},
booktitle={Proceedings of the 9th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2013},
pages={105-110},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004373201050110},
isbn={978-989-8565-54-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 9th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI - Web Forums Change Analysis
SN - 978-989-8565-54-9
AU - Kaczmarek T.
AU - Węckowski D.
PY - 2013
SP - 105
EP - 110
DO - 10.5220/0004373201050110