Mining Closed Numerical Patterns in Spatial Databases (空間資料庫中封閉性數值樣式之資料探勘)

Shin-Ling Lee; 李欣陵

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/47245

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	李瑞庭(Anthony J. T. Lee)
dc.contributor.author	Shin-Ling Lee	en
dc.contributor.author	李欣陵	zh_TW
dc.date.accessioned	2021-06-15T05:52:08Z	-
dc.date.available	2013-08-19
dc.date.copyright	2010-08-19
dc.date.issued	2010
dc.date.submitted	2010-08-18
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/47245	-
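dc.description.abstract	As positioning technology becomes more widespread, large amounts of spatial data can be collected. Mining meaningful frequent spatial patterns from spatial databases has therefore become an increasingly popular research topic. Data mining techniques can help us discover closed numerical patterns in a spatial database and identify the quantitative relationships between different regions, and thereby understand or predict market trends. In this thesis, we propose an efficient mining algorithm called CNP-Mine to discover closed numerical patterns in spatial databases. The CNP-Mine algorithm consists of two phases. In the first phase, we generate all frequent patterns of length 1. In the second phase, we recursively generate all frequent patterns in a depth-first manner. During enumeration, we apply pruning strategies to eliminate unnecessary candidate patterns, and we check whether each pattern is closed. Because CNP-Mine only needs to scan projected databases and avoids generating unnecessary candidate patterns, the experimental results show that the proposed method outperforms the modified A-Close algorithm in both execution speed and memory usage.	zh_TW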
dc.description.abstract	With advances in positioning technology, a large amount of spatial data has been collected into databases. Mining frequent spatial patterns has attracted more and more attention recently. Mining numerical patterns in spatial databases can help us identify the quantitative relationships between different locations and thereby understand or predict market trends. Therefore, in this thesis, we propose a novel algorithm, CNP-Mine (Closed Numerical Pattern Mining), to mine the closed numerical patterns in a spatial database. The proposed algorithm consists of two phases. First, we find all frequent patterns of length one (1-patterns) in the database and generate a projected database for each frequent 1-pattern found. Next, we use a frequent spatial pattern tree to recursively generate frequent patterns in a depth-first search (DFS) manner until no more frequent closed patterns can be found. During the mining process, we employ several effective pruning strategies to prune unnecessary candidates and a closure checking scheme to remove non-closed patterns. Moreover, we localize the support counting and pattern joins in projected databases. Thus, the proposed method can efficiently mine closed numerical patterns in a spatial database. The experimental results show that the CNP-Mine algorithm outperforms the modified A-Close algorithm by several orders of magnitude.	en
dc.description.provenance	Made available in DSpace on 2021-06-15T05:52:08Z (GMT). No. of bitstreams: 1 ntu-99-R97725002-1.pdf: 610148 bytes, checksum: ce5bbcc33a70a0c157c291b239728694 (MD5) Previous issue date: 2010	en
dc.description.tableofcontents	Table of Contents; List of Figures; List of Tables; Chapter 1 Introduction; Chapter 2 Literature Review; Chapter 3 Preliminary Concepts and Problem Definitions; Chapter 4 The Proposed Method; 4.1 Pattern Enumeration; 4.1.1 Generating frequent 1-patterns; 4.1.2 Generating frequent k-patterns; 4.2 Pruning Strategy and Closure Checking; 4.3 The CNP-Mine Algorithm; 4.4 An Example; Chapter 5 Performance Analysis; 5.1 Synthetic datasets; 5.2 Performance evaluation on synthetic datasets; 5.3 Performance evaluation on real datasets; Chapter 6 Conclusions and Future Work; References
dc.language.iso	en
dc.title	空間資料庫中封閉性數值樣式之資料探勘	zh_TW
dc.title	Mining Closed Numerical Patterns in Spatial Databases	en
dc.type	Thesis
dc.date.schoolyear	98-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	陳彥良(Yen-Liang Chen),諶家蘭(Jia-Lang Seng)
dc.subject.keyword	數值樣式,頻繁樣式,封閉性樣式,空間資料庫,資料探勘	zh_TW
dc.subject.keyword	numerical patterns,frequent patterns,closed patterns,spatial databases,data mining	en
dc.relation.page	34
dc.rights.note	Fee-based license
dc.date.accepted	2010-08-18
dc.contributor.author-college	College of Management	zh_TW
dc.contributor.author-dept	Graduate Institute of Information Management	zh_TW
Appears in Collections:	Department of Information Management
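
The two-phase approach summarized in the abstract (find frequent 1-patterns, then grow patterns depth-first over projected databases with pruning and closure checking) can be illustrated with a minimal sketch. This is not the thesis's CNP-Mine algorithm: it assumes plain symbolic itemsets rather than numerical spatial patterns, uses transaction-id sets as a stand-in for projected databases, and the function name `mine_closed_patterns` is hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List, Set


def mine_closed_patterns(
    transactions: Iterable[Iterable[str]], min_support: int
) -> Dict[FrozenSet[str], int]:
    """Return {closed frequent itemset: support} (illustrative sketch only)."""
    if min_support < 1:
        raise ValueError("min_support must be at least 1")
    db: List[FrozenSet[str]] = [frozenset(t) for t in transactions]

    # Phase 1: frequent 1-patterns. The set of transaction ids containing
    # an item plays the role of that item's projected database (assumption).
    tids_of: Dict[str, Set[int]] = {}
    for tid, row in enumerate(db):
        for item in row:
            tids_of.setdefault(item, set()).add(tid)
    frequent_items = sorted(
        item for item, tids in tids_of.items() if len(tids) >= min_support
    )

    closed: Dict[FrozenSet[str], int] = {}

    def grow(pattern: FrozenSet[str], tids: Set[int], start: int) -> None:
        # Closure check: the pattern is closed iff no additional item
        # occurs in every transaction that supports it.
        common = frozenset.intersection(*(db[t] for t in tids))
        if common == pattern:
            closed[pattern] = len(tids)
        # Phase 2: depth-first extension; support is counted only inside
        # the projected database (the current tid set).
        for k in range(start, len(frequent_items)):
            item = frequent_items[k]
            new_tids = tids & tids_of[item]
            if len(new_tids) >= min_support:  # prune infrequent candidates
                grow(pattern | {item}, new_tids, k + 1)

    for k, item in enumerate(frequent_items):
        grow(frozenset([item]), tids_of[item], k + 1)
    return closed
```

For example, calling `mine_closed_patterns([{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}], 2)` keeps `{a}`, `{a, b}`, and `{a, c}` but discards `{b}` and `{c}`, because every transaction containing `b` or `c` also contains `a`. The sketch enumerates every frequent itemset before checking closure, so it lacks the stronger pruning strategies the thesis describes.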

Files in This Item:

File	Size	Format
ntu-99-1.pdf (not currently authorized for public access)	595.85 kB	Adobe PDF


Unless their copyright terms are specifically indicated, all items in this system are protected by copyright, with all rights reserved.