XPACK: A HIGH-PERFORMANCE WEB DOCUMENT ENCODING

Daniel Rocco, James Caverlee, Ling Liu

2005

Abstract

XML is an increasingly popular data storage and exchange format whose success can be attributed to its self-describing syntax, acceptance as a data transmission and archival standard, strong internationalization support, and a plethora of supporting tools and technologies. However, XML’s verbose, repetitive, text-oriented syntax is a liability for many emerging applications such as mobile computing and distributed document dissemination. This paper presents XPack, an efficient XML document compression system that exploits information inherent in the document structure to improve compression quality. Additionally, the use of XML structure features in XPack’s design should provide valuable support for structure-aware queries over compressed documents. Taken together, the techniques employed in the XPack compression scheme provide a foundation for efficiently storing, transmitting, and operating over Web documents. Initial experimental results demonstrate that XPack can reduce the storage requirements for Web documents by up to 20% over previous XML compression techniques. More significantly, XPack can simultaneously support operations over the documents, providing up to two orders of magnitude performance improvement for certain document operations when compared to equivalent operations on unencoded XML documents.


Paper Citation


in Harvard Style

Rocco D., Caverlee J. and Liu L. (2005). XPACK: A HIGH-PERFORMANCE WEB DOCUMENT ENCODING. In Proceedings of the First International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 972-8865-20-1, pages 32-39. DOI: 10.5220/0001233000320039

in Bibtex Style

@conference{webist05,
author={Daniel Rocco and James Caverlee and Ling Liu},
title={XPACK: A HIGH-PERFORMANCE WEB DOCUMENT ENCODING},
booktitle={Proceedings of the First International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2005},
pages={32-39},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001233000320039},
isbn={972-8865-20-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the First International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI - XPACK: A HIGH-PERFORMANCE WEB DOCUMENT ENCODING
SN - 972-8865-20-1
AU - Rocco D.
AU - Caverlee J.
AU - Liu L.
PY - 2005
SP - 32
EP - 39
DO - 10.5220/0001233000320039