Compressing Inverted Files using Modified LZW
Vasileios Iosifidis, Christos Makris
2016
Abstract
In this paper, we present a compression algorithm that employs a modification of the well-known Lempel-Ziv-Welch (LZW) algorithm; it creates an index that treats terms as characters and stores encoded document identifier patterns efficiently. We also equip our approach with a set of preprocessing methods (reassignment of document identifiers, Gaps) and post-processing methods (Gaps, IPC encoding, GZIP) in order to attain more significant space improvements. We used two different combinations of these discrete steps to see which one maximizes the performance of our modification of the LZW algorithm. Experiments on the Wikipedia dataset demonstrate the superior space compaction of the proposed technique.
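To make the ingredients named in the abstract more concrete, the following is a minimal sketch of two of them: converting a posting list of document identifiers into gaps, and running textbook LZW over the resulting integer sequence. It is not the authors' modified LZW, whose details are not given here; the function names (`to_gaps`, `from_gaps`, `lzw_encode`, `lzw_decode`) and the sample posting list are assumptions made purely for illustration.

```python
def to_gaps(doc_ids):
    """Turn a strictly increasing list of document ids into gaps (differences)."""
    gaps, prev = [], 0
    for d in doc_ids:
        gaps.append(d - prev)
        prev = d
    return gaps


def from_gaps(gaps):
    """Invert to_gaps by taking running sums."""
    ids, total = [], 0
    for g in gaps:
        total += g
        ids.append(total)
    return ids


def lzw_encode(symbols):
    """Textbook LZW over a sequence of hashable symbols (illustrative only)."""
    alphabet = sorted(set(symbols))
    # Seed the dictionary with every single symbol that occurs in the input.
    table = {(s,): i for i, s in enumerate(alphabet)}
    codes, current = [], ()
    for s in symbols:
        candidate = current + (s,)
        if candidate in table:
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = len(table)
            current = (s,)
    if current:
        codes.append(table[current])
    return alphabet, codes


def lzw_decode(alphabet, codes):
    """Rebuild the symbol sequence from the alphabet and the emitted codes."""
    table = {i: (s,) for i, s in enumerate(alphabet)}
    if not codes:
        return []
    prev = table[codes[0]]
    out = list(prev)
    for code in codes[1:]:
        # A code that is not in the table yet must be prev + prev[0].
        entry = table[code] if code in table else prev + (prev[0],)
        out.extend(entry)
        table[len(table)] = prev + (entry[0],)
        prev = entry
    return out


# Hypothetical posting list: a term that occurs in every third document.
postings = [3, 6, 9, 12, 15, 18, 21]
gaps = to_gaps(postings)
alphabet, codes = lzw_encode(gaps)
restored = from_gaps(lzw_decode(alphabet, codes))
```

Regular posting lists produce repetitive gap sequences, which is the kind of input a dictionary coder such as LZW can shorten; here the seven gaps are replaced by fewer codes, and decoding restores the original identifiers.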
Paper Citation
in Harvard Style
Iosifidis V. and Makris C. (2016). Compressing Inverted Files using Modified LZW. In Proceedings of the 12th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 978-989-758-186-1, pages 156-163. DOI: 10.5220/0005857201560163
in Bibtex Style
@conference{webist16,
author={Vasileios Iosifidis and Christos Makris},
title={Compressing Inverted Files using Modified LZW},
booktitle={Proceedings of the 12th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2016},
pages={156-163},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005857201560163},
isbn={978-989-758-186-1},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 12th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI - Compressing Inverted Files using Modified LZW
SN - 978-989-758-186-1
AU - Iosifidis V.
AU - Makris C.
PY - 2016
SP - 156
EP - 163
DO - 10.5220/0005857201560163