FINDING SUITABLE KEYWORDS FOR A WEB PAGE FROM CACHES BASED ON SIMILARITY AND FREQUENCY
Yasuhiro Tajima, Yoshiyuki Kotani
2007
Abstract
Metadata are the most important entries in a web page for summarization, indexing, and similar tasks. Unfortunately, there are many kinds of metadata items but few guidelines for constructing the metadata for a web page. We propose a metadata-finding method for a web page that searches internet caches and selects suitable items for the target page. Our method is based on a Bayesian method used in the area of text retrieval. We evaluate this method in an experiment that finds a set of suitable keywords for a source web page. Comparing the original meta-tagged keywords with the system output, we obtain 74% precision and 76% recall. We conclude that this method captures the tendency of the metadata annotated on pages similar to the target page.
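The evaluation in the abstract compares a page's original meta-tagged keywords with the keywords produced by the system. A minimal sketch of how precision and recall could be computed for one page is given below. The function name, the case-insensitive matching, and the sample keyword lists are illustrative assumptions, not details taken from the paper.

```python
def precision_recall(original_keywords, system_keywords):
    """Compare system output against a page's original meta keywords.

    Assumption (not from the paper): keywords match when they are equal
    after trimming whitespace and lowercasing.
    """
    original = {k.strip().lower() for k in original_keywords if k.strip()}
    system = {k.strip().lower() for k in system_keywords if k.strip()}
    matched = original & system

    # Precision: share of system keywords that appear in the original set.
    precision = len(matched) / len(system) if system else 0.0
    # Recall: share of original keywords that the system recovered.
    recall = len(matched) / len(original) if original else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical keyword lists, for illustration only.
    original = ["web", "metadata", "keywords", "cache"]
    system = ["Metadata", "keywords", "cache", "search", "bayes"]
    p, r = precision_recall(original, system)
    print(f"precision={p:.2f} recall={r:.2f}")
```

In this hypothetical example three of the five system keywords match the original set, so precision is 0.60 and recall is 0.75. The abstract does not say how the per-page values are aggregated into the reported 74% and 76%, so that step is left out here.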
Paper Citation
in Harvard Style
Tajima Y. and Kotani Y. (2007). FINDING SUITABLE KEYWORDS FOR A WEB PAGE FROM CACHES BASED ON SIMILARITY AND FREQUENCY. In Proceedings of the Third International Conference on Web Information Systems and Technologies - Volume 2: WEBIST, ISBN 978-972-8865-78-8, pages 474-477. DOI: 10.5220/0001289204740477
in Bibtex Style
@conference{webist07,
author={Yasuhiro Tajima and Yoshiyuki Kotani},
title={FINDING SUITABLE KEYWORDS FOR A WEB PAGE FROM CACHES BASED ON SIMILARITY AND FREQUENCY},
booktitle={Proceedings of the Third International Conference on Web Information Systems and Technologies - Volume 2: WEBIST},
year={2007},
pages={474-477},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001289204740477},
isbn={978-972-8865-78-8},
}
in EndNote Style
TY - CONF
JO - Proceedings of the Third International Conference on Web Information Systems and Technologies - Volume 2: WEBIST
TI - FINDING SUITABLE KEYWORDS FOR A WEB PAGE FROM CACHES BASED ON SIMILARITY AND FREQUENCY
SN - 978-972-8865-78-8
AU - Tajima Y.
AU - Kotani Y.
PY - 2007
SP - 474
EP - 477
DO - 10.5220/0001289204740477