Mining Closed Patterns in Pointset Databases (點集合資料庫中封閉性樣式之資料探勘)

Po-Yin Chen; 陳柏吟

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/9826

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	李瑞庭
dc.contributor.author	Po-Yin Chen	en
dc.contributor.author	陳柏吟	zh_TW
dc.date.accessioned	2021-05-20T20:43:42Z	-
dc.date.available	2010-07-23
dc.date.available	2021-05-20T20:43:42Z	-
dc.date.copyright	2008-07-23
dc.date.issued	2008
dc.date.submitted	2008-07-17
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/9826	-
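dc.description.abstract	With advances in mobile computing and positioning technologies, location-based services (LBS) have flourished. Positioning devices let us collect large amounts of point-coordinate data into a pointset database, where a pointset is a set of point coordinates. Data mining techniques can help us discover the frequent paths or behaviors of moving objects in a pointset database. This thesis therefore studies how to find frequently occurring patterns or paths in a pointset database. We propose an efficient mining algorithm, PCP-Miner, to find the closed patterns in a pointset database. The algorithm consists of two phases. In the first phase, we generate all frequent patterns of length 2. In the second phase, we use a frequent pattern tree to recursively generate all frequent patterns in a depth-first manner. During generation, we check whether each pattern is closed. Because PCP-Miner scans the database only once and avoids generating unnecessary candidate patterns, the experimental results show that the proposed method is more efficient than the modified Apriori algorithm on both synthetic and real data.	zh_TW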
dc.description.abstract	With advances in mobile computing and positioning technologies, location-based services (LBS) have made significant progress. Using these technologies, a large number of pointsets can be collected in an LBS database, where a pointset is a set of points. Mining frequent pointsets in a pointset database can help us understand the movement patterns of objects. In this thesis, we propose a novel algorithm, PCP-Miner (Pointset Closed Pattern Miner), to mine frequent closed pointset patterns. The algorithm consists of two phases. First, we find all frequent patterns of length two in the database. Second, for each pattern found in the first phase, we recursively generate frequent patterns with a frequent pattern tree in a depth-first manner. During pattern generation, we check whether each frequent pattern is closed. Because PCP-Miner scans the database only once and does not generate unnecessary candidates, it is more efficient than the modified Apriori algorithm. The experimental results show that PCP-Miner outperforms the modified Apriori algorithm by one order of magnitude on both synthetic and real data.	en
dc.description.provenance	Made available in DSpace on 2021-05-20T20:43:42Z (GMT). No. of bitstreams: 1 ntu-97-R95725010-1.pdf: 587963 bytes, checksum: 7c41667a8ac7bea17cab16ce40731940 (MD5) Previous issue date: 2008	en
dc.description.tableofcontents	Table of Contents; List of Figures; List of Tables; Chapter 1 Introduction; Chapter 2 Problem Definition; Chapter 3 Our Proposed Method; 3.1 Frequent pattern enumeration; 3.2 The closure checking and pruning strategies; 3.3 The PCP-Miner algorithm; 3.4 An example; Chapter 4 Performance Analysis; 4.1 Synthetic datasets; 4.2 Performance evaluation on synthetic datasets; 4.3 Performance evaluation on the real dataset; Chapter 5 Conclusions and Future Work; References
dc.language.iso	en
dc.title	點集合資料庫中封閉性樣式之資料探勘	zh_TW
dc.title	Mining Closed Patterns in Pointset Databases	en
dc.type	Thesis
dc.date.schoolyear	96-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	劉敦仁,陳彥良
dc.subject.keyword	data mining (資料探勘), pointset databases (點集合資料庫), closed patterns (封閉性樣式)	zh_TW
dc.subject.keyword	data mining, pointset databases, closed patterns	en
dc.relation.page	38
dc.rights.note	Authorization granted (publicly available worldwide)
dc.date.accepted	2008-07-18
dc.contributor.author-college	College of Management	zh_TW
dc.contributor.author-dept	Graduate Institute of Information Management	zh_TW
Appears in Collections:	Department of Information Management
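
The two-phase idea summarized in the abstract (frequent length-two patterns first, then depth-first growth with a closure check) can be illustrated with a minimal sketch. This is not the thesis's PCP-Miner: it assumes points match only by exact coordinates, it recounts support by rescanning the database instead of using a frequent pattern tree with a single scan, and it checks closure after enumeration rather than during it. The names `mine_closed` and `support` are hypothetical.

```python
from itertools import combinations


def support(pattern, database):
    """Count the pointsets that contain every point of the pattern."""
    return sum(1 for pointset in database if pattern <= pointset)


def mine_closed(database, min_support):
    """Return {pattern: support} for closed frequent patterns of length >= 2.

    Simplifying assumptions (not from the thesis): points are exact (x, y)
    tuples, and support is recounted by rescanning the database.
    """
    database = [frozenset(pointset) for pointset in database]

    # Phase 1: count every pair of points and keep the frequent ones.
    pair_counts = {}
    for pointset in database:
        for pair in combinations(sorted(pointset), 2):
            pair_counts[pair] = pair_counts.get(pair, 0) + 1
    frequent_pairs = sorted(p for p, c in pair_counts.items() if c >= min_support)
    frequent_points = sorted({pt for pair in frequent_pairs for pt in pair})

    # Phase 2: depth-first growth. A pattern is a sorted tuple and is only
    # extended with larger points, so each pattern is generated once.
    frequent = {}

    def grow(pattern):
        frequent[pattern] = support(frozenset(pattern), database)
        for pt in frequent_points:
            if pt > pattern[-1]:
                candidate = pattern + (pt,)
                if support(frozenset(candidate), database) >= min_support:
                    grow(candidate)

    for pair in frequent_pairs:
        grow(pair)

    # Closure check: drop a pattern if a proper superset has equal support.
    closed = {}
    for pattern, sup in frequent.items():
        points = set(pattern)
        has_equal_superset = any(
            sup == other_sup and points < set(other)
            for other, other_sup in frequent.items()
        )
        if not has_equal_superset:
            closed[pattern] = sup
    return closed
```

For example, with three pointsets where two contain the points (0, 0), (1, 1), (2, 2) and one contains only (0, 0), (1, 1), a minimum support of 2 keeps the pair {(0, 0), (1, 1)} with support 3 and the full triple with support 2. The other two pairs are frequent but not closed, because the triple has the same support.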

Files in This Item:

File	Size	Format
ntu-97-1.pdf	574.18 kB	Adobe PDF


Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.