AUTOMATIC IDENTIFICATION OF SPECIFIC WEB DOCUMENTS BY USING CENTROID TECHNIQUE

Udomsit Sukakanya, Kriengkrai Porkaew

2005

Abstract

In order to reduce time to find specific information from high volume of information on the Web, this paper proposes the implementation of an automatic identification of specific Web documents by using centroid technique. The Initial training sets in this experiment are 4113 Thai e-Commerce Web documents. After training process, the system gets a Centroid e-Commerce vector. In order to evaluate the system, six test sets were taken under consideration. In each test set has 100 Web pages both known e-Commerce and non e-Commerce Web pages. The average system performance is about 90%.

Download


Paper Citation


in Harvard Style

Sukakanya U. and Porkaew K. (2005). AUTOMATIC IDENTIFICATION OF SPECIFIC WEB DOCUMENTS BY USING CENTROID TECHNIQUE . In Proceedings of the First International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 972-8865-20-1, pages 333-338. DOI: 10.5220/0001234903330338

in Bibtex Style

@conference{webist05,
author={Udomsit Sukakanya and Kriengkrai Porkaew},
title={AUTOMATIC IDENTIFICATION OF SPECIFIC WEB DOCUMENTS BY USING CENTROID TECHNIQUE},
booktitle={Proceedings of the First International Conference on Web Information Systems and Technologies - Volume 1: WEBIST,},
year={2005},
pages={333-338},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001234903330338},
isbn={972-8865-20-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the First International Conference on Web Information Systems and Technologies - Volume 1: WEBIST,
TI - AUTOMATIC IDENTIFICATION OF SPECIFIC WEB DOCUMENTS BY USING CENTROID TECHNIQUE
SN - 972-8865-20-1
AU - Sukakanya U.
AU - Porkaew K.
PY - 2005
SP - 333
EP - 338
DO - 10.5220/0001234903330338