EFFICIENT MANAGEMENT OF NON REDUNDANT RULES IN LARGE PATTERN BASES: A BITMAP APPROACH

François Jacquenet, Christine Largeron, Cédric Udréa

2006

Abstract

Knowledge Discovery from Databases has more and more impact nowadays and various tools are now available to extract efficiently (in time and memory space) some knowledge from huge databases. Nevertheless, those systems generally produce some large pattern bases and then the management of these one rapidly becomes untractable. Few works have focused on pattern base management systems and researches on that domain are really new. This paper comes within that context, dealing with a particular class of patterns that is association rules. More precisely, we present the way we have efficiently implemented the search for non redundant rules thanks to a representation of rules in the form of bitmap arrays. Some experiments show that the use of this technique increases dramatically the gain in time and space, allowing us to manage large pattern bases.

References

  1. Aggarwal, C. C. and Yu, P. S. (1998). Online generation of association rules. In Proceedings of the 14th Conference on Data Engineering, pages 402-411, Orlando.
  2. Agrawal, R., Imielinski, T., and Swami, A. (1993). Mining association rules between sets of items in large databases. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 207-216, Washington D.C.
  3. Albert-Lorincz, H. and Boulicaut, J.-F. (2003). Mining frequent sequential patterns under regular expressions: A highly adaptive strategy for pushing contraints. In Proceedings of the Third SIAM International Conference on Data Mining, San Francisco, CA, USA, May 1-3, 2003.
  4. Bastide, Y., Pasquier, N., Taouil, R., Stumme, G., and Lakhal, L. (2000). Mining minimal non-redundant association rules using frequent closed itemsets. In Proceedings of the first International Conference on Computational Logic, LNCS 1861, pages 972-986.
  5. Bayardo, R., Agrawal, R., and Gunopulos, D. (2000). Constraint-based rule mining in large, dense databases. Data Mining and Knowledge Discovery, 4(2/3):217-240.
  6. Boulicaut, J. and Jeudy, B. (2005). Constraint-based data mining. In The Data Mining and Knowledge Discovery Handbook, pages 399-416. Springer.
  7. Boulicaut, J. F. (2005). Condensed representations for data mining. In Encyclopedia of Data Warehousing and Mining, pages 207-211. Idea Group Reference.
  8. Boulicaut, J. F. and Masson, C. (2005). Data mining query languages. In The Data Mining and Knowledge Discovery Handbook, pages 715-727. Springer.
  9. Catania, B., Maddalena, A., Mazza, M., Bertino, E., and Rizzi, S. (2004). A framework for data mining pattern management. In Proceedings of the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases, LNCS 3202, pages 87-98.
  10. Freitas, A. A. (1999). On rule interestingness measures. Knowledge-Based Systems, 12(5-6):309-315.
  11. Garofalakis, M. N., Rastogi, R., and Shim, K. (2002). Mining sequential patterns with regular expression constraints. IEEE Transactions on Knowledge and Data Engineering, 14(3):530-552.
  12. Goethals, B., Muhonen, J., and Toivonen, H. (2005). Mining non-derivable association rules. In Proceedings of the fifth SIAM International Conference on Data Mining, Newport Beach, California, USA, April 21-23.
  13. Grossman, R. L., Bailey, S., Ramu, A., Malhi, B., Hallstrom, P., Pulleyn, I., and Qin, X. (1999). The management and mining of multiple predictive models using the predictive model markup language (PMML). In Information and Software Technology, volume 41, pages 589-595.
  14. Li, G. and Hamilton, H. (2004). Basic association rules. In Proceedings of the fourth SIAM International Conference on Data Mining, Lake Buena Vista, Florida, USA, April 22-24. SIAM.
  15. Li, Y., Liu, Z. T., Chen, L., Cheng, W., and Xie, C. H. (2004). Extracting minimal non-redundant association rules from QCIL. In International Conference on Computer and Information Technology, pages 986- 991. IEEE Computer Society.
  16. Louie, E. and Lin, T. Y. (2000). Finding association rules using fast bit computation: Machine-oriented modeling. In Proceedings of the 12th International Symposium on Methodologies for Intelligent Systems, LNCS 1932, pages 486-494. Springer.
  17. Masson, C., Robardet, C., and Boulicaut, J. F. (2004). Optimizing subset queries: a step towards sql-based inductive databases for itemsets. In Proceedings of the 2004 ACM symposium on Applied computing (SAC'04), pages 535-539. ACM Press.
  18. Morzy, T. and Zakrzewicz, M. (1998). Group bitmap index: A structure for association rules retrieval. In Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining, pages 284- 288. AAAI Press.
  19. Pasquier, N., Taouil, R., Bastide, Y., Stumme, G., and Lakhal, L. (2005). Generating a condensed representation for association rules. Journal of Intelligent Information Systems, 24(1):29-60.
  20. Pucheral, P., Gardarin, G., and Wu, L. (1998). Bitmap based algorithms for mining association rules. In Actes des Journées Bases de Données Avancées (BDA'98).
  21. Tuzhilin, A. and Liu, B. (2002). Querying multiple sets of discovered rules. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 52-60. ACM.
  22. Wang, J. and Han, J. (2004). BIDE: Efficient mining of frequent closed sequences. In Proceedings of the 20th International Conference on Data Engineering, ICDE 2004, 30 March - 2 April 2004, Boston, MA, USA, pages 79-90.
  23. Wang, J., Han, J., Lu, Y., and Tzvetkov, P. (2005). TFP: An efficient algorithm for mining top-k frequent closed itemsets. IEEE Transactions on Knowledge and Data Engineering, 17(5):652-664.
  24. Zaki, M. J. (2000). Generating non-redundant association rules. In Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, August 20-23, 2000, Boston, MA, USA, pages 34-43.
  25. Zaki, M. J. (2001). SPADE: An efficient algorithm for mining frequent sequences. Machine Learning, 42(1/2):31-60.
  26. Zaki, M. J. (2004). Mining non-redundant association rules. Data Mining and Knowledge Discovery, 9(3):223- 248.
  27. Zaki, M. J. and Hsiao, C. (2005). Efficient algorithms for mining closed itemsets and their lattice structure. IEEE Transactions on Knowledge and Data Engineering, 17(4):462-478.
  28. Zaki, M. J., Parimi, N., De, N., Gao, F., Phoophakdee, B., Urban, J., Chaoji, V., Hasan, M. A., and Salem, S. (2005). Towards generic pattern mining. In Proceedings of the Third International Conference on Formal Concept Analysis, pages 1-20.
Download


Paper Citation


in Harvard Style

Jacquenet F., Largeron C. and Udréa C. (2006). EFFICIENT MANAGEMENT OF NON REDUNDANT RULES IN LARGE PATTERN BASES: A BITMAP APPROACH . In Proceedings of the Eighth International Conference on Enterprise Information Systems - Volume 2: ICEIS, ISBN 978-972-8865-42-9, pages 208-215. DOI: 10.5220/0002495902080215


in Bibtex Style

@conference{iceis06,
author={François Jacquenet and Christine Largeron and Cédric Udréa},
title={EFFICIENT MANAGEMENT OF NON REDUNDANT RULES IN LARGE PATTERN BASES: A BITMAP APPROACH},
booktitle={Proceedings of the Eighth International Conference on Enterprise Information Systems - Volume 2: ICEIS,},
year={2006},
pages={208-215},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002495902080215},
isbn={978-972-8865-42-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Eighth International Conference on Enterprise Information Systems - Volume 2: ICEIS,
TI - EFFICIENT MANAGEMENT OF NON REDUNDANT RULES IN LARGE PATTERN BASES: A BITMAP APPROACH
SN - 978-972-8865-42-9
AU - Jacquenet F.
AU - Largeron C.
AU - Udréa C.
PY - 2006
SP - 208
EP - 215
DO - 10.5220/0002495902080215