Mining Closed Multi-sequence Patterns in Time-series Databases

Tzu-Yu Lee; 李梓煜

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/30063

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	李瑞庭
dc.contributor.author	Tzu-Yu Lee	en
dc.contributor.author	李梓煜	zh_TW
dc.date.accessioned	2021-06-13T01:33:48Z	-
dc.date.available	2008-07-20
dc.date.copyright	2007-07-20
dc.date.issued	2007
dc.date.submitted	2007-07-16
dc.identifier.citation	[1] J. Ayres, J. Gehrke, T. Yiu, and J. Flannick, Sequential pattern mining using a bitmap representation, Proc. of the eighth ACM International Conference on Knowledge Discovery and Data Mining, pp. 429-435, 2002. [2] D. J. Berndt, and J. Clifford, Finding patterns in time series: a dynamic programming approach, Advances in knowledge discovery and data mining, pp. 229-248, 1996. [3] G. Chen, X. Wu, and X. Zhu, Sequential pattern mining in multiple streams, Proc. of the Fifth IEEE International Conference on Data Mining, pp. 585-588, 2005. [4] Central Bank of Republic of China, http://www.cbc.gov.tw/, 2007. [5] C. Faloutsos, M. Ranganathan, and Y. Manolopoulos, Fast subsequence matching in time-series databases, Proc. of the ACM SIGMOD conference on Management of Data, pp. 419-429, 1994. [6] J. Han, G. Dong, and Y. Yin, Efficient mining of partial periodic patterns in time series database, Proc. of the 15th International Conference on Data Engineering, pp. 106-115, 1999. [7] J. Han, J. Wang, Y. Lu, and P. Tzvetkov, Mining top-k frequent closed patterns without minimum support, Proc. of the IEEE International Conference on Data Mining, pp. 211-218, 2002. [8] C. Lucchese, S. Orlando, and R. Perego, Fast and memory efficient mining of frequent closed itemsets, IEEE Transactions on Knowledge and Data Engineering, pp. 21-36, 2006. [9] J. Lin, E. Keogh, S. Lonardi, and B. Chiu, A symbolic representation of time series, with implications for streaming algorithms, Proc. of the 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, pp. 2-11, 2003. [10] H. D. K. Moonesinghe, S. Fodeh, and P. N. Tan, Frequent closed itemset mining using prefix graphs with an efficient flow-based pruning strategy, Proc. of the Sixth International Conference on Data Mining, pp. 426-435, 2006. [11] J. Pei, J. Han, and R. Mao, CLOSET: an efficient algorithm for mining frequent closed itemsets, Proc. of the 5th ACM-SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, pp. 11-20, 2000. [12] J. Pei, J. Han, B. Mortazavi-Asl, and H. Pinto, PrefixSpan: mining sequential patterns efficiently by prefix-projected pattern growth, Proc. of the International Conference on Data Engineering, pp. 215-224, 2001. [13] R. Agrawal, and R. Srikant, Mining sequential patterns, Proc. of the International Conference on Data Engineering, Taipei, Taiwan, pp. 3-14, 1995. [14] R. Agrawal, and R. Srikant, Fast algorithms for mining association rules, Proc. of the 20th International Conference Very Large Data Bases, pp. 487-499, 1994. [15] R. Srikant, and R. Agrawal, Mining sequential patterns: generalizations and performance improvements, Proc. of the 5th International Conference on Extending Database Technology, pp. 3-17, 1996. [16] P. Tzvetkov, X. Yan, and J. Han, TSP: mining top-K closed sequential patterns, Proc. of the third IEEE International Conference on Data Mining, pp. 347-354, 2003. [17] J. Wang, and J. Han, BIDE: efficient mining of frequent closed sequences, Proc. of the 20th International Conference on Data Engineering, pp. 79-90, 2004. [18] J. Wang, J. Han, and J. Pei, CLOSET+: searching for the best strategies for mining frequent closed itemsets, Proc. of the 9th ACM International Conference on Knowledge Discovery and Data Mining, Washington, DC, pp. 236-245, 2003. [19] X. Yan, J. Han, and R. Afshar, CloSpan: mining closed sequential patterns in large databases, Proc. of the SIAM International Conference on Data Mining, San Francisco, CA, pp. 166-177, 2003. [20] T. Q. Yang, A time series data mining based on ARMA and MLFNN model for intrusion detection, Journal of Communication and Computer, pp. 16-22, 2006. [21] M. J. Zaki, and C. Hsiao, Efficient algorithms for mining closed itemsets and their lattice structure, IEEE Transactions on Knowledge and Data Engineering, pp. 462-478, 2005. [22] M. J. Zaki, SPADE: an efficient algorithm for mining frequent sequences, Machine Learning, vol. 42, pp. 31-60, 2001.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/30063	-
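dc.description.abstract	Sequential pattern analysis has been widely applied in many fields, and many methods for finding sequential patterns have been proposed. However, the existing methods assume that each transaction contains only a single sequence; they consider neither transactions that contain multiple sequences nor the time intervals between itemsets. This thesis therefore studies the mining of multi-sequence patterns. We propose an efficient mining algorithm, called CMP-Miner, to find closed multi-sequence patterns in time-series databases. Our method consists of three phases. In the first phase, we transform each time series into a symbolic sequence. In the second phase, we generate all frequent patterns and, during generation, check whether each pattern is closed. In the third phase, we repeat the steps of the second phase until no more closed patterns can be found. The experimental results show that the proposed method is efficient and scalable, and that it runs up to tens of times faster than the Apriori algorithm.	zh_TW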
dc.description.abstract	Many algorithms have been proposed to find sequential patterns in a sequence database. However, these algorithms are not suitable for mining frequent patterns in a time-series database, because they consider neither multiple sequences within a transaction nor the time intervals between the itemsets in a frequent pattern. Therefore, in this thesis, we propose an efficient algorithm, called CMP-Miner, to mine closed patterns in time-series databases where each transaction contains multiple sequences. The proposed algorithm consists of three phases. First, we transform each time series into a symbolic sequence. Second, we generate all frequent patterns and check whether they are closed during pattern generation. Third, we repeat the second phase until no more closed patterns can be generated. The experimental results show that the proposed algorithm is efficient and scalable, and that it outperforms the modified Apriori algorithm by one order of magnitude.	en
dc.description.provenance	Made available in DSpace on 2021-06-13T01:33:48Z (GMT). No. of bitstreams: 1 ntu-96-R94725026-1.pdf: 547726 bytes, checksum: af37b8feb7e3431a3d68de0f52cd9902 (MD5) Previous issue date: 2007	en
dc.description.tableofcontents	Table of Contents; List of Figures; List of Tables; Chapter 1 Introduction; Chapter 2 Problem Definition; Chapter 3 Our Proposed Method (3.1 Frequent pattern enumeration; 3.2 The closure checking method and pruning strategies; 3.3 The CMP-Miner algorithm); Chapter 4 Performance Analysis (4.1 Experiments on synthetic data; 4.2 Experiments on real data); Chapter 5 Conclusions and Future Work; References
dc.language.iso	en
dc.subject	algorithm	zh_TW
dc.subject	time series	zh_TW
dc.subject	closed pattern	zh_TW
dc.subject	sequential pattern	zh_TW
dc.subject	data mining	zh_TW
dc.subject	closed pattern	en
dc.subject	algorithm	en
dc.subject	time-series sequence	en
dc.subject	sequential pattern	en
dc.subject	data mining	en
dc.title	Mining Closed Multi-sequence Patterns in Time-series Databases	zh_TW
dc.title	Mining Closed Patterns in Multi-sequence Time-series Databases	en
dc.type	Thesis
dc.date.schoolyear	95-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	諶家蘭,吳怡瑾
dc.subject.keyword	data mining, sequential pattern, time series, closed pattern, algorithm	zh_TW
dc.subject.keyword	data mining, sequential pattern, time-series sequence, closed pattern, algorithm	en
dc.relation.page	31
dc.rights.note	Paid license
dc.date.accepted	2007-07-17
dc.contributor.author-college	College of Management	zh_TW
dc.contributor.author-dept	Graduate Institute of Information Management	zh_TW
Appears in Collections:	Department of Information Management
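
The three phases summarized in the abstract can be illustrated with a minimal Python sketch. It is not the CMP-Miner algorithm: the record does not describe the thesis's symbolization scheme, pattern definition, or pruning strategies, so everything below is an assumption made for illustration. In particular, `symbolize` uses a hypothetical up/down/stable encoding with a user-chosen `threshold`, a pattern is assumed to be a contiguous window of time points spanning all sequences of a transaction, and `closed_patterns` enumerates windows by brute force instead of pruning the search space.

```python
from collections import defaultdict


def symbolize(series, threshold):
    """Phase 1 (assumed scheme): encode each step-to-step change as U, D or S."""
    symbols = []
    for previous, current in zip(series, series[1:]):
        change = current - previous
        if change > threshold:
            symbols.append("U")   # up
        elif change < -threshold:
            symbols.append("D")   # down
        else:
            symbols.append("S")   # stable
    return "".join(symbols)


def window_supports(database, max_len):
    """Count how many transactions contain each contiguous window of columns."""
    supports = defaultdict(int)
    for transaction in database:
        # One column per time point: the symbols of all sequences at that point.
        columns = list(zip(*transaction))
        seen = set()
        for start in range(len(columns)):
            stop = min(start + max_len, len(columns))
            for end in range(start + 1, stop + 1):
                seen.add(tuple(columns[start:end]))
        for pattern in seen:  # count each pattern once per transaction
            supports[pattern] += 1
    return supports


def closed_patterns(database, min_support, max_len):
    """Phases 2 and 3 (brute force): keep frequent patterns with no equal-support extension."""
    # Windows one column longer than max_len are counted so that every
    # reported pattern can be compared against its one-column extensions.
    supports = window_supports(database, max_len + 1)
    closed = {}
    for pattern, support in supports.items():
        if support < min_support or len(pattern) > max_len:
            continue
        has_equal_extension = any(
            len(longer) == len(pattern) + 1
            and other == support
            and (longer[1:] == pattern or longer[:-1] == pattern)
            for longer, other in supports.items()
        )
        if not has_equal_extension:
            closed[pattern] = support
    return closed
```

Here each transaction is a list of equal-length symbol strings, for example `["UUD", "SDD"]` for two aligned series. Checking only one-column extensions is enough in this simplified setting: support never increases when a window grows, so if any longer window had the same support, a one-column extension lying inside it would have that support too.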

Files in this item:

File	Size	Format
ntu-96-1.pdf (not authorized for public access)	534.89 kB	Adobe PDF

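Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.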