可快速探勘封閉頻繁物品集且毋需產生中繼物之演算法

Tzu-Yun Chen; 陳子筠

請用此 Handle URI 來引用此文件： http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/62085

完整後設資料紀錄

DC 欄位	值	語言
dc.contributor.advisor	陳健輝(Gen-Huey Chen)
dc.contributor.author	Tzu-Yun Chen	en
dc.contributor.author	陳子筠	zh_TW
dc.date.accessioned	2021-06-16T13:26:56Z	-
dc.date.available	2016-07-26
dc.date.copyright	2013-07-26
dc.date.issued	2013
dc.date.submitted	2013-07-23
dc.identifier.citation	[1] R. Agrawal, T. Imielinski, and A. Swami, 'Mining association rules between sets of items in large databases,' Proc. the ACM SIGMOD International Conference on Management of Data, Washington, D.C., USA, 1993. [2] R. Agrawal and R. Srikant, 'Fast algorithms for mining association rules,' Proc. the 20th International Conference Very Large Data Bases, VLDB, 1994, pp. 487- 499. [3] J. Han, J. Pei, Y. Yin, and R. Mao, 'Mining frequent patterns without candidate generation: a frequent-pattern tree approach,' Data Mining and Knowledge Discovery, vol. 8, pp. 53-87, 2004. [4] N. Pasquier, Y. Bastide, R. Taouil, and L. Lakhal, 'Discovering frequent closed itemsets for association rules,' Database Theory - ICDT’99, Springer, pp. 398- 416, 1999. [5] M. J. Zaki and C.-J. Hsiao, 'CHARM: an efficient algorithm for closed itemset mining,' Proc. the 2nd SIAM International Conference on Data Mining, pp. 457-473, 2002. [6] C. Borgelt, X. Yang, R. Nogales-Cadenas, P. Carmona-Saez, and A. Pascual- Montano, 'Finding closed frequent item sets by intersecting transactions,' Proc. the 14th International Conference on Extending Database Technology, Uppsala, Sweden, 2011. [7] R. Agrawal and R. Srikant, 'Mining sequential patterns,' Proc. the 11th International Conference on Data Engineering, , pp. 3-14, 1995. [8] J. Han, J. Pei, B. Mortazavi-Asl, Q. Chen, U. Dayal, and M.-C. Hsu, 'FreeSpan: frequent pattern-projected sequential pattern mining,' Proc. the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Boston, Massachusetts, USA, 2000. [9] J. Pei, J. Han, B. Mortazavi-Asl, J. Wang, H. Pinto, Q. Chen, U. Dayal, and M.-C. Hsu, 'Mining sequential patterns by pattern-growth: the prefixspan approach,' IEEE Transactions on Knowledge and Data Engineering , vol. 16, pp. 1424-1440, 2004. [10] J. Wang, J. Han, and C. Li, 'Frequent closed sequence mining without candidate maintenance,' IEEE Transactions on Knowledge and Data Engineering, vol. 19, pp. 1042-1056, 2007. [11] Y. Liu, W.-K. Liao, and A. Choudhary, 'A fast high utility itemsets mining algorithm,' Proc. the 1st International Workshop on Utility-based Data Mining, Chicago, Illinois, 2005. [12] V. S. Tseng, C.-W. Wu, B.-E. Shie, and P. S. Yu, 'UP-Growth: an efficient algorithm for high utility itemset mining,' Proc. the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, 2010. [13] M. Liu and J. Qu, 'Mining high utility itemsets without candidate generation,' Proc. the 21st ACM International Conference on Information and Knowledge Management, Maui, Hawaii, USA, 2012. [14] C. F. Ahmed, S. K. Tanbeer, B.-S. Jeong, and Y.-K. Lee, 'Efficient tree structures for high utility pattern mining in incremental databases,' IEEE Transactions on Knowledge and Data Engineering, vol. 21, pp. 1708-1721, 2009. [15] V. S. Tseng, B.-E. Shie, C.-W. Wu, and P. S. Yu, 'Efficient algorithms for mining high utility itemsets from transactional databases,' IEEE Transactions on Knowledge and Data Engineering, pp. 1-14, 2012. [16] J. Liu, K. Wang, and B. Fung, 'Direct discovery of high utility itemsets without candidate generation,' Proc. the 12th IEEE International Conference on Data Mining (ICDM), pp. 984-989, 2012. [17] C. W. Wu, P. Fournier-Viger, P. S. Yu, and V. S. Tseng, 'Efficient mining of a concise and lossless representation of high utility itemsets,' Proc. the 11th IEEE International Conference on Data Mining (ICDM), pp. 824-833, 2011. [18] C.-H. Lee, C.-R. Lin, and M.-S. Chen, 'Sliding-window filtering: an efficient algorithm for incremental mining,' Proc. the 10th International Conference on Information and Knowledge Management, Atlanta, Georgia, USA, 2001. [19] C.-Y. Chang, M.-S. Chen, and C.-H. Lee, 'Mining general temporal association rules for items with different exhibition periods,' Proc. the IEEE International Conference on Data Mining(ICDM), pp. 59-66, 2002. [20] C.-H. Lee, M.-S. Chen, and C.-R. Lin, 'Progressive partition miner: an efficient algorithm for mining general temporal association rules,' IEEE Transactions on Knowledge and Data Engineering, vol. 15, pp. 1004-1017, 2003. [21] J.-W. Huang, B.-R. Dai, and M.-S. Chen, 'Twain: two-end association miner with precise frequent exhibition periods,' ACM Transactions Knowledge Discovery Data, vol. 1, 2007. [22] X. Liu, J. Guan, and P. Hu, 'Mining frequent closed itemsets from a landmark window over online data streams,' Computers & Mathematics with Applications, vol. 57, pp. 927-936, 2009. [23] H.-F. Li and S.-Y. Lee, 'Mining frequent itemsets over data streams using efficient window sliding techniques,' Expert Systems with Applications, vol. 36, pp. 1466-1477, 2009. [24] S. K. Tanbeer, C. F. Ahmed, B.-S. Jeong, and Y.-K. Lee, 'Sliding window-based frequent pattern mining over data streams,' Information sciences, vol. 179, pp. 3843-3865, 2009. [25] T. F. Gharib, H. Nassar, M. Taha, and A. Abraham, 'An efficient algorithm for incremental mining of temporal association rules,' Data & Knowledge Engineering, vol. 69, pp. 800-815, 2010. [26] G. S. Manku and R. Motwani, 'Approximate frequency counts over data streams,' Proc. the 28th International Conference on Very Large Data Bases, Hong Kong, China, 2002. [27] H.-F. Li, C.-C. Ho, and S.-Y. Lee, 'Incremental updates of closed frequent itemsets over continuous data streams,' Expert Systems with Applications, vol. 36, pp. 2451-2458, 2009. [28] B.-E. Shie, V. S. Tseng, and P. S. Yu, 'Online mining of temporal maximal utility itemsets from data streams,' Proc. the 2010 ACM Symposium on Applied Computing, Sierre, Switzerland, 2010. [29] C. W. Wu, B.-E. Shie, V. S. Tseng, and P. S. Yu, 'Mining top-k high utility itemsets,' Proc. the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Beijing, China, 2012. [30] H. S. L. Shen, P. Pritchard and R. Topor, 'Finding the N largest itemsets,' Proc. International Conference on Data Mining, pp. 211-222, 1998. [31] A. W.-C. Fu, R. W.-W. Kwong, and J. Tang, 'Mining N-most interesting itemsets,' Foundations of Intelligent Systems, Springer, pp. 59-67, 2010. [32] J. Han, J. Wang, Y. Lu, and P. Tzvetkov, 'Mining top-k frequent closed patterns without minimum support,' Proc. the IEEE International Conference on Data Mining( ICDM), pp. 211-218, 2002. [33] Y. Hirate, E. Iwahashi, and H. Yamana, 'TF2P-growth: an efficient algorithm for mining frequent patterns without any thresholds,' Proc. the IEEE ICDM 2004 Workshop on Alternative Techniques for Data Mining and Knowledge Discovery, 2004. [34] Y.-L. Cheung and A. W.-C. Fu, 'Mining frequent itemsets without support threshold: with and without item constraints,' IEEE Transactions on Knowledge and Data Engineering, vol. 16, pp. 1052-1069, 2004. [35] J. Wang, J. Han, Y. Lu, and P. Tzvetkov, 'TFP: an efficient algorithm for mining top-k frequent closed itemsets,' IEEE Transactions on Knowledge and Data Engineering, vol. 17, pp. 652-663, 2005. [36] S.-C. Ngan, T. Lam, R. C.-W. Wong, and A. W.-C. Fu, 'Mining N-most interesting itemsets without support threshold by the COFI-tree,' International Journal of Business Intelligence and Data Mining, vol. 1, pp. 88-106, 2005. [37] T. M. Quang, S. Oyanagi, and K. Yamazaki, 'ExMiner: an efficient algorithm for mining top-k frequent patterns,' Advanced Data Mining and Applications, Springer, pp. 436-447, 2006. [38] K.-T. Chuang, J.-L. Huang, and M.-S. Chen, 'Mining top-k frequent patterns in the presence of the memory constraint,' The VLDB Journal, vol. 17, pp. 1321- 1344, 2008. [39] Y. Xie and P. S. Yu, 'Max-Clique: a top-down graph-based approach to frequent pattern mining,' Proc. the 10th IEEE International Conference on Data Mining (ICDM), pp. 1139-1144, 2010. [40] J. Wang, J. Han, and J. Pei, 'CLOSET+: searching for the best strategies for mining frequent closed itemsets,' Proc. the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, D.C., 2003. [41] D. Burdick, M. Calimlim, and J. Gehrke, 'MAFIA: a maximal frequent itemset algorithm for transactional databases,' Proc. the 17th International Conference on Data Engineering, pp. 443-452, 2001. [42] J. Pei, J. Han, and R. Mao, 'CLOSET: an efficient algorithm for mining frequent closed itemsets,' ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, pp. 21-30, 2000.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/62085	-
dc.description.abstract	探勘封閉頻繁物品集在資料探勘的領域中已經被廣泛的研究。自從關聯規則被提出後，探勘頻繁物品集就成為了重要的研究課題，而探勘頻繁物品集的問題在於使用記憶體的空間太多，為了減少記憶體的使用量，探勘封閉頻繁物品集的研究也就問世了。封閉頻繁物品集不僅能夠保留完整的頻繁物品集的資訊，還能減少記憶體的使用量，此外封閉頻繁物品集也能提供完整且無累贅的關聯規則結果。在本篇論文中提出了CLOFI，一個快速探勘封閉頻繁物品集的演算法。CLOFI運用不同條件下的子資料庫保留必要的資訊來產生封閉頻繁物品集，並且使用了two-way extension check技術來檢查找到的物品集是否為封閉頻繁物品集，因為以上兩種技術，CLOFI得以產生封閉頻繁物品集而不用產生中繼物。在評估實驗的部份會拿CLOFI和之前論文的演算法CLOSET+來做比較，我們用了幾個資料庫來做實驗，其中包含真實和合成的資料庫。最後將實驗結果的時間和記憶體使用量做分析，並且發現其中代表的含義。	zh_TW
dc.description.abstract	Closed frequent itemsets mining has been widely studied in data mining research. Since association rules mining proposed, frequent itemsets mining became an important research issue. The problem of frequent itemsets mining is that the memory consumption is too high. In order to reduce the memory consumption, closed frequent itemsets mining has proposed. Closed frequent itemsets can not only keep the complete information like frequent itemsets but also reduce the consumption of memory. Furthermore, it provides complete and non-redundant results for mining association rules. In this thesis we present CLOFI, a fast algorithm for mining closed frequent itemsets. It uses conditional databases to keep the necessary information for enumerating closed frequent itemsets. It also uses two-way extension check technology to check the finding itemset is closed frequent itemset or not. According to above two techniques, CLOFI can mine closed frequent itemsets without candidate generation. The performance which CLOFI compares with previous work CLOSET+ shows in experiment evaluation. We use several databases including both real and synthetic databases in the experiments. Finally we have shown the significance of the experiment results, including time and memory consumptions.	en
dc.description.provenance	Made available in DSpace on 2021-06-16T13:26:56Z (GMT). No. of bitstreams: 1 ntu-102-R00922156-1.pdf: 4193569 bytes, checksum: b9e50d3f5bd3d1efdc5ccdd5ae4e650e (MD5) Previous issue date: 2013	en
dc.description.tableofcontents	口試委員會審定書 # 誌謝 i 中文摘要 ii ABSTRACT iii CONTENTS iv LIST OF FIGURES vi LIST OF TABLES vii Chapter 1 Introduction 1 1.1 Problem Statement 1 1.2 Research Issues 3 1.3 Mining Closed Frequent Itemsets 5 Chapter 2 Related Work 6 2.1 CHARM 6 2.2 CLOSET+ 7 2.3 BIDE 10 Chapter 3 A Mining Algorithm 14 3.1 Conditional Database 14 3.2 Extension Check 17 3.3 The Algorithm 20 3.4 Comparison 21 Chapter 4 Experiments 23 4.1 Experimental Environments 23 4.2 Experimental Results 26 4.3 Discussion 35 Chapter 5 Conclusion 36 References 37 Appendix A 43 Appendix B 50
dc.language.iso	zh-TW
dc.subject	關聯規則	zh_TW
dc.subject	資料探勘	zh_TW
dc.subject	封閉頻繁物品集	zh_TW
dc.subject	資料探勘	zh_TW
dc.subject	關聯規則	zh_TW
dc.subject	封閉頻繁物品集	zh_TW
dc.subject	association rules	en
dc.subject	data mining	en
dc.subject	closed frequent itemsets	en
dc.subject	association rules	en
dc.subject	data mining	en
dc.subject	closed frequent itemsets	en
dc.title	可快速探勘封閉頻繁物品集且毋需產生中繼物之演算法	zh_TW
dc.title	A Fast Algorithm for Mining Closed Frequent Itemsets without Candidate Generation	en
dc.type	Thesis
dc.date.schoolyear	101-2
dc.description.degree	碩士
dc.contributor.oralexamcommittee	林宣華(Shian-Hua Lin),周信宏(Hsin-Hung Chou)
dc.subject.keyword	資料探勘,關聯規則,封閉頻繁物品集,	zh_TW
dc.subject.keyword	data mining,association rules,closed frequent itemsets,	en
dc.relation.page	53
dc.rights.note	有償授權
dc.date.accepted	2013-07-23
dc.contributor.author-college	電機資訊學院	zh_TW
dc.contributor.author-dept	資訊工程學研究所	zh_TW
顯示於系所單位：	資訊工程學系

文件中的檔案：

檔案	大小	格式
ntu-102-1.pdf 未授權公開取用	4.1 MB	Adobe PDF

顯示文件簡單紀錄

系統中的文件，除了特別指名其著作權條款之外，均受到著作權保護，並且保留所有的權利。