Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/8755
Full metadata record
DC field: value (language)
dc.contributor.advisor: 李瑞庭
dc.contributor.author: Huei-Wen Wu (en)
dc.contributor.author: 吳惠雯 (zh_TW)
dc.date.accessioned: 2021-05-20T20:00:42Z
dc.date.available: 2013-02-04
dc.date.available: 2021-05-20T20:00:42Z
dc.date.copyright: 2010-02-04
dc.date.issued: 2010
dc.date.submitted: 2010-01-28
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/8755
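dc.description.abstract: In recent years, mining closed patterns has become an important research topic in knowledge discovery; its main goal is to find representative patterns hidden in large amounts of data. This dissertation proposes three efficient algorithms for mining closed patterns from time-series databases: CMP (Closed Multi-sequence Patterns mining), CFP (Closed Flexible Patterns mining), and CNP (multi-resolution Closed Numerical Patterns mining).
The CMP algorithm focuses on analyzing patterns in multi-sequence time-series databases, while the CFP algorithm mines closed patterns with "flexible gaps" from time-series databases. Both CMP and CFP first convert time series into symbolic sequences and then mine closed patterns. Converting time series into symbolic sequences reduces noise and simplifies the mining process, but it may cause patterns to be lost, or cause very different sequences to support the same pattern.
To avoid the problems caused by converting time series into symbolic sequences, the CNP algorithm finds representative patterns directly from multi-sequence time-series databases. In addition, CNP incorporates the concept of multi-resolution to find closed patterns, which lets users examine the data from different perspectives.
All three algorithms mine in a depth-first manner and use projected databases to reduce the search space. Besides speeding up the mining process, their pruning strategies and closure-checking mechanisms avoid generating unnecessary candidate patterns.
The results show that the CMP algorithm is tens of times faster than the modified Apriori and BIDE algorithms, that the CFP algorithm is more efficient than the modified Apriori algorithm, and that the CNP algorithm also runs faster than the modified A-Close algorithm. (zh_TW)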
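The abstract notes that CMP and CFP first convert each time series into a symbolic sequence, and that very different sequences can end up supporting the same pattern. The following is a minimal sketch of one such conversion, assuming a SAX-style discretization (the table of contents lists SAX in the literature review, but this record does not say exactly how the thesis applies it). The function name `sax`, the three-letter alphabet, and the sample series are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abc"):
    """Convert a numeric series to a SAX-style symbol string (illustrative only)."""
    mu, sigma = mean(series), pstdev(series)
    # z-normalise the series; a constant series maps to all zeros
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]
    # Piecewise aggregate approximation: mean of each equal-length segment.
    # Assumes len(series) is a multiple of n_segments.
    seg = len(z) // n_segments
    paa = [mean(z[i * seg:(i + 1) * seg]) for i in range(n_segments)]
    # Breakpoints split the standard normal distribution into equiprobable regions
    a = len(alphabet)
    cuts = [NormalDist().inv_cdf(k / a) for k in range(1, a)]
    # Each segment mean is replaced by the symbol of the region it falls in
    return "".join(alphabet[bisect_right(cuts, v)] for v in paa)


print(sax([1, 1, 5, 5, 9, 9], 3))        # abc
print(sax([10, 10, 50, 50, 90, 90], 3))  # abc
```

The two sample series differ by a factor of ten yet map to the same symbol string, which illustrates the drawback the abstract describes.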
dc.description.abstract: Closed pattern mining is a critical research issue in knowledge discovery and data mining; its aim is to discover interesting patterns hidden in large amounts of data. In this dissertation, we propose three algorithms, CMP (Closed Multi-sequence Patterns mining), CFP (Closed Flexible Patterns mining), and CNP (multi-resolution Closed Numerical Patterns mining), to solve several problems that extend closed pattern mining.
The CMP algorithm is designed to find closed patterns in a multi-sequence time-series database. The CFP algorithm is developed to mine closed flexible patterns in a time-series database. Both the CMP and CFP algorithms first transform time-series sequences into symbolic sequences. Although analyzing symbolic sequences reduces the effect of noise and eases the mining process, this approach may lose patterns, and the sequences supporting the same pattern may look quite different.
To overcome the problems raised by symbolic sequence analysis, the CNP algorithm mines closed patterns without transforming time-series sequences into symbolic sequences. It also employs the Haar wavelet transform to discover patterns at multiple resolutions, providing different perspectives on the data.
All the proposed algorithms use projected databases to localize pattern extension, which leads to a significant runtime improvement. Moreover, each algorithm has its own closure checking schemes and pruning strategies to avoid generating redundant candidates.
The experimental results show that the CMP algorithm significantly outperforms the modified Apriori and BIDE algorithms, that the CFP algorithm achieves better performance than the modified Apriori algorithm in all cases, and that the CNP algorithm runs significantly faster than the modified A-Close algorithm. (en)
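The abstract states that CNP uses the Haar wavelet transform to view the data at multiple resolutions. As a rough illustration, one level of an unnormalised Haar transform replaces each pair of neighbouring values with their average and half-difference, and repeating the averaging step yields progressively coarser views. The function names `haar_step` and `resolutions`, the choice of the unnormalised variant, and the sample data are assumptions made for this sketch; the record does not specify how the thesis normalises or applies the transform.

```python
def haar_step(values):
    """One level of the unnormalised Haar transform.

    Returns (averages, details), each half the length of the input.
    Assumes the input has an even number of values.
    """
    pairs = list(zip(values[0::2], values[1::2]))
    averages = [(a + b) / 2 for a, b in pairs]  # coarser view of the series
    details = [(a - b) / 2 for a, b in pairs]   # what is needed to undo the step
    return averages, details


def resolutions(values):
    """Return the series at every resolution, finest first.

    Assumes the length of the input is a power of two.
    """
    levels = [list(values)]
    while len(levels[-1]) > 1:
        averages, _ = haar_step(levels[-1])
        levels.append(averages)
    return levels


print(haar_step([2, 4, 6, 8]))    # ([3.0, 7.0], [-1.0, -1.0])
print(resolutions([2, 4, 6, 8]))  # [[2, 4, 6, 8], [3.0, 7.0], [5.0]]
```

Each coarser level halves the length of the series, so a pattern miner run on a coarse level sees broad trends, while the finest level preserves the original values.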
dc.description.provenance: Made available in DSpace on 2021-05-20T20:00:42Z (GMT). No. of bitstreams: 1
ntu-99-D95725007-1.pdf: 1045780 bytes, checksum: fa4a4f8f9278bad7b959f6d7e12838a5 (MD5)
Previous issue date: 2010 (en)
dc.description.tableofcontents:
Dissertation Abstract (in Chinese)
Dissertation Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Motivations
1.2 Contributions
1.3 Dissertation layout
Chapter 2 Literature Review
2.1 Time-series mining
2.2 Sequential pattern mining
2.3 Closed pattern mining
2.4 SAX representation
2.5 Discussion
Chapter 3 Mining Closed Patterns in Multi-Sequence Time-Series Databases
3.1 Preliminaries and problem definitions
3.2 Frequent pattern tree
3.3 Closure checking and pruning strategies
3.4 The CMP algorithm
3.5 An example
Chapter 4 Mining Closed Flexible Patterns in Time-Series Databases
4.1 Preliminaries and problem definitions
4.2 Sequence tree
4.3 Support counting
4.4 Closure checking and pruning strategies
4.5 The CFP algorithm
4.6 An example
Chapter 5 Mining Multi-Resolution Closed Numerical Patterns in Multi-Sequence Time-Series Databases
5.1 Preliminaries and problem definitions
5.2 Haar wavelet transform
5.3 Numerical pattern tree
5.4 Comparison of projected databases
5.5 Pruning strategies
5.6 Frequent 1-pattern generation
5.7 The CNP algorithm
5.8 Mining multi-resolution patterns
5.9 An example
Chapter 6 Performance Evaluation
6.1 Synthetic data
6.2 Performance evaluation of the CMP algorithm
6.2.1 Evaluations on synthetic data
6.2.2 Evaluations on real data
6.3 Performance evaluation of the CFP algorithm
6.3.1 Evaluations on synthetic data
6.3.2 Evaluations on real data
6.4 Performance evaluation of the CNP algorithm
6.4.1 Evaluations on synthetic data
6.4.2 Evaluations on real data
Chapter 7 Conclusions and Future Work
References
dc.language.iso: en
dc.title: 時間序列資料庫之封閉性樣式探勘 (zh_TW)
dc.title: Mining Closed Patterns in Time-Series Databases (en)
dc.type: Thesis
dc.date.schoolyear: 98-1
dc.description.degree: Doctoral
dc.contributor.oralexamcommittee: 魏志平, 諶家蘭, 呂永和, 陳建錦
dc.subject.keyword: 資料探勘, 封閉性樣式, 時間序列資料庫 (zh_TW)
dc.subject.keyword: Data mining, Closed pattern, Time-series database (en)
dc.relation.page: 114
dc.rights.note: Authorization granted (open access worldwide)
dc.date.accepted: 2010-01-28
dc.contributor.author-college: College of Management (zh_TW)
dc.contributor.author-dept: Graduate Institute of Information Management (zh_TW)
Appears in collections: Department of Information Management

Files in this item:
File: ntu-99-1.pdf, size: 1.02 MB, format: Adobe PDF


Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.
