Exploiting Progressions for Improving Inverted Index Compression

Christos Makris, Yannis Plegas

2013

Abstract

In this paper, we present an algorithmic technique for compressing the document identifier lists of an inverted index which can be combined with state of the art compression techniques such as algorithms from the PForDelta family or Interpolative Coding and attain significant space compaction gains. The new technique initially converts the lists of document identifiers to a set of arithmetic progressions; the representation of an arithmetic progression uses at most three (!) numbers. This process produces an overhead which are the multiple identifiers that have to be assigned to the documents so we have to use a secondary inverted index and an extra compression step (PForDelta or Interpolative Coding) to represent it. Performed experiments in the ClueWeb09 dataset depict the superiority in space compaction of the proposed technique.

Download


Paper Citation


in Harvard Style

Makris C. and Plegas Y. (2013). Exploiting Progressions for Improving Inverted Index Compression . In Proceedings of the 9th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 978-989-8565-54-9, pages 251-256. DOI: 10.5220/0004365402510256

in Bibtex Style

@conference{webist13,
author={Christos Makris and Yannis Plegas},
title={Exploiting Progressions for Improving Inverted Index Compression},
booktitle={Proceedings of the 9th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST,},
year={2013},
pages={251-256},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004365402510256},
isbn={978-989-8565-54-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 9th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST,
TI - Exploiting Progressions for Improving Inverted Index Compression
SN - 978-989-8565-54-9
AU - Makris C.
AU - Plegas Y.
PY - 2013
SP - 251
EP - 256
DO - 10.5220/0004365402510256