HIGH CANDIDATES GENERATION: A NEW EFFICIENT METHOD FOR MINING SHARE-FREQUENT PATTERNS

Authors

  • Chayanan Nawapornanan, Department of Computer Science, Faculty of Science, King Mongkut's Institute of Technology Ladkrabang, Bangkok 10520, Thailand
  • Sarun Intakosum, Department of Computer Science, Faculty of Science, King Mongkut's Institute of Technology Ladkrabang, Bangkok 10520, Thailand
  • Veera Boonjing, International College, King Mongkut's Institute of Technology Ladkrabang, Bangkok 10520, Thailand

DOI:

https://doi.org/10.11113/jt.v79.10292

Keywords:

Data mining, association rule mining, knowledge discovery, share-frequent pattern mining, frequent pattern mining, frequent itemset mining

Abstract

Share-frequent pattern mining is more practical than traditional frequent pattern mining because it can reflect useful knowledge such as the total costs and profits of patterns. Mining share-frequent patterns has become one of the most important research issues in data mining. However, previous algorithms generate and test a large number of useless candidates, which costs considerable time in the mining process. This paper proposes a new efficient method for discovering share-frequent patterns. The new method reduces the number of candidates by generating candidates only from high transaction-measure-value patterns. The downward closure property of transaction-measure-value patterns ensures the correctness of the proposed method. Experimental results on dense and sparse datasets show that the proposed method is very efficient in terms of execution time. It also decreases the number of useless candidates generated in the mining process by at least 70%.
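
The pruning idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm, whose details are not given here. It is a generic level-wise search that assumes the usual share-measure definitions: each transaction maps items to a measure value (for example, quantity), the local measure value of an itemset is the sum of its items' values over the transactions containing it, and an itemset is share-frequent when that sum reaches `min_share` times the total measure value of the database. The sketch treats an itemset as a "high transaction-measure-value pattern" when the summed totals of the transactions containing it reach the same threshold; this definition, the function name `mine_share_frequent`, and the sample data are assumptions made for illustration.

```python
from itertools import combinations


def mine_share_frequent(db, min_share):
    """Level-wise sketch. db is a list of dicts mapping item -> measure value."""
    tmv = [sum(t.values()) for t in db]   # total measure value of each transaction
    threshold = min_share * sum(tmv)      # minimum local measure value required

    def evaluate(itemset):
        # Returns (upper bound, local measure value) of the itemset.
        bound = lmv = 0
        for t, total in zip(db, tmv):
            if all(i in t for i in itemset):
                bound += total                     # whole-transaction total
                lmv += sum(t[i] for i in itemset)  # only the itemset's items
        return bound, lmv

    items = sorted({i for t in db for i in t})
    level = [frozenset([i]) for i in items]
    result = {}
    while level:
        # Keep only candidates whose upper bound reaches the threshold.
        # The bound can only shrink when items are added (downward closure),
        # so supersets of a discarded candidate are never share-frequent.
        high = []
        for cand in level:
            bound, lmv = evaluate(cand)
            if bound >= threshold:
                high.append(cand)
                if lmv >= threshold:
                    result[cand] = lmv
        high_set = set(high)
        k = len(level[0]) + 1
        nxt = set()
        for a, b in combinations(high, 2):
            u = a | b
            # Generate a k-candidate only if all its (k-1)-subsets are high.
            if len(u) == k and all(
                frozenset(s) in high_set for s in combinations(u, k - 1)
            ):
                nxt.add(u)
        level = sorted(nxt, key=sorted)
    return result


# Hypothetical three-transaction database (item -> quantity).
db = [{"a": 1, "b": 2}, {"a": 3, "c": 1}, {"a": 1, "b": 1, "c": 2}]
result = mine_share_frequent(db, 0.3)
```

In this sample, item `b` alone is not share-frequent at `min_share = 0.3`, yet `{a, b}` is. Share-frequency itself is therefore not downward closed, which is why the sketch prunes on the transaction-level bound rather than on the share value.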

Published

2017-10-22

Section

Science and Engineering

How to Cite

HIGH CANDIDATES GENERATION: A NEW EFFICIENT METHOD FOR MINING SHARE-FREQUENT PATTERNS. (2017). Jurnal Teknologi, 79(7). https://doi.org/10.11113/jt.v79.10292