Skip Search Approach for Mining Probabilistic Frequent Itemsets from Uncertain Data

Takahiko Shintani, Tadashi Ohmori, Hideyuki Fujita

2016

Abstract

As data mining is applied more widely, data uncertainty has come to be taken into account. In this paper, we study mining probabilistic frequent itemsets from uncertain data under the Possible World Semantics. Because each tuple in probabilistic data has an existential probability, the support of an itemset is a probability mass function (pmf). We propose a skip search approach that reduces the evaluation of support pmfs for redundant itemsets. The approach starts evaluating support pmfs at the average length of the candidate itemsets. When an evaluated itemset is not probabilistic frequent, all of its supersets are deleted from the candidate itemsets and one of its subsets is selected as the next candidate to evaluate. When an evaluated itemset is probabilistic frequent, one of its supersets is selected as the next candidate to evaluate. Furthermore, our approach evaluates the support pmf by difference calculus using already evaluated itemsets. Thus, our approach reduces both the number of candidate itemsets whose support pmf must be evaluated and the cost of evaluating each support pmf. Finally, we show the effectiveness of our approach through experiments.
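To make the notion of a support pmf concrete, the following is a minimal sketch and not the authors' algorithm. It assumes tuple-level uncertainty with independent tuples, and it assumes the commonly used definition that an itemset is probabilistic frequent when P(support >= minsup) >= tau. The names `support_pmf`, `is_probabilistic_frequent`, `minsup`, `tau` and the toy database `db` are illustrative and do not come from the paper. The pmf is computed with a plain dynamic program over the tuples; the paper's difference calculus and skip search order are not reproduced here.

```python
from typing import FrozenSet, List, Tuple

# Assumption: each tuple is (set of items, existential probability),
# and tuples exist independently of each other.
Transaction = Tuple[FrozenSet[str], float]


def support_pmf(itemset: FrozenSet[str], db: List[Transaction]) -> List[float]:
    """Return pmf where pmf[k] = P(support(itemset) == k)."""
    pmf = [1.0]  # with no tuples processed, the support is 0 with certainty
    for items, p in db:
        if not itemset <= items:
            continue  # a tuple that lacks the itemset never adds support
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # tuple absent in this possible world
            nxt[k + 1] += mass * p      # tuple present in this possible world
        pmf = nxt
    return pmf


def is_probabilistic_frequent(
    itemset: FrozenSet[str], db: List[Transaction], minsup: int, tau: float
) -> bool:
    """Assumed definition: P(support >= minsup) >= tau."""
    pmf = support_pmf(itemset, db)
    return sum(pmf[minsup:]) >= tau


# Hypothetical toy database used only for illustration.
db: List[Transaction] = [
    (frozenset({"a", "b"}), 0.5),
    (frozenset({"a"}), 0.5),
    (frozenset({"a", "b", "c"}), 1.0),
]

print(support_pmf(frozenset({"a"}), db))
print(is_probabilistic_frequent(frozenset({"a"}), db, minsup=2, tau=0.7))
print(is_probabilistic_frequent(frozenset({"a", "b"}), db, minsup=2, tau=0.7))
```

Under the assumed definition, the check is anti-monotone: a superset is contained in no more tuples than its subset, so its probability of reaching `minsup` cannot be larger. This is the property that allows all supersets of a non-frequent itemset to be dropped from the candidates.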


Paper Citation


in Harvard Style

Shintani T., Ohmori T. and Fujita H. (2016). Skip Search Approach for Mining Probabilistic Frequent Itemsets from Uncertain Data. In Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016) ISBN 978-989-758-203-5, pages 174-180. DOI: 10.5220/0006035401740180

in Bibtex Style

@conference{kdir16,
author={Takahiko Shintani and Tadashi Ohmori and Hideyuki Fujita},
title={Skip Search Approach for Mining Probabilistic Frequent Itemsets from Uncertain Data},
booktitle={Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016)},
year={2016},
pages={174-180},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006035401740180},
isbn={978-989-758-203-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016)
TI - Skip Search Approach for Mining Probabilistic Frequent Itemsets from Uncertain Data
SN - 978-989-758-203-5
AU - Shintani T.
AU - Ohmori T.
AU - Fujita H.
PY - 2016
SP - 174
EP - 180
DO - 10.5220/0006035401740180