FINDING SUITABLE KEYWORDS FOR A WEB PAGE FROM CACHES BASED ON SIMILARITY AND FREQUENCY

Yasuhiro Tajima, Yoshiyuki Kotani

2007

Abstract

Metadata are the most important entries in a web page for summarization, indexing, and similar tasks. Unfortunately, there are many kinds of metadata items but few guidelines for constructing the metadata of a web page. We propose a method that finds metadata for a web page by searching internet caches and selecting items suitable for the target page. Our method is based on a Bayesian method used in the area of text retrieval. We evaluate it in an experiment that finds a set of suitable keywords for a source web page. Comparing the original meta-tagged keywords with the system output, we obtain 74% precision and 76% recall. We conclude that this method captures the tendency of the metadata annotated on pages similar to the target page.
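
The evaluation above compares a page's original meta-tagged keywords with the keywords produced by the system. The abstract does not say how keywords were matched, so the following is only a minimal sketch of how precision and recall could be computed for one page, assuming exact matching after trimming and lower-casing. The function name `precision_recall` and the sample keywords are hypothetical and are not taken from the paper.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) of predicted keywords against reference keywords.

    Assumption (not from the paper): two keywords match when they are
    identical after stripping whitespace and lower-casing.
    """
    pred = {k.strip().lower() for k in predicted if k.strip()}
    ref = {k.strip().lower() for k in reference if k.strip()}
    hits = len(pred & ref)  # keywords present in both sets
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(ref) if ref else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical data: system output vs. the page's own meta keywords.
    system_output = ["web", "cache", "metadata", "search"]
    original_tags = ["Web", "metadata", "keywords"]
    p, r = precision_recall(system_output, original_tags)
    print(f"precision={p:.2f} recall={r:.2f}")
```

Precision here is the fraction of system keywords that also appear in the original tags, and recall is the fraction of original tags that the system recovered. How the paper aggregates these values over pages is not stated in the abstract.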


Paper Citation


in Harvard Style

Tajima Y. and Kotani Y. (2007). FINDING SUITABLE KEYWORDS FOR A WEB PAGE FROM CACHES BASED ON SIMILARITY AND FREQUENCY. In Proceedings of the Third International Conference on Web Information Systems and Technologies - Volume 2: WEBIST, ISBN 978-972-8865-78-8, pages 474-477. DOI: 10.5220/0001289204740477

in Bibtex Style

@conference{webist07,
author={Yasuhiro Tajima and Yoshiyuki Kotani},
title={FINDING SUITABLE KEYWORDS FOR A WEB PAGE FROM CACHES BASED ON SIMILARITY AND FREQUENCY},
booktitle={Proceedings of the Third International Conference on Web Information Systems and Technologies - Volume 2: WEBIST},
year={2007},
pages={474-477},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001289204740477},
isbn={978-972-8865-78-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Third International Conference on Web Information Systems and Technologies - Volume 2: WEBIST
TI - FINDING SUITABLE KEYWORDS FOR A WEB PAGE FROM CACHES BASED ON SIMILARITY AND FREQUENCY
SN - 978-972-8865-78-8
AU - Tajima Y.
AU - Kotani Y.
PY - 2007
SP - 474
EP - 477
DO - 10.5220/0001289204740477