Mining Closed Patterns in Time-Series Databases
Data mining, Closed pattern, Time-series database
|Publication Year:||2010|
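|Abstract:||In recent years, mining closed patterns has become an important research topic in knowledge discovery; its main goal is to find representative patterns hidden in large volumes of data. This dissertation proposes three efficient algorithms for mining closed patterns from time-series databases: CMP (Closed Multi-sequence Patterns mining), CFP (Closed Flexible Patterns mining), and CNP (multi-resolution Closed Numerical Patterns mining).
The results show that the CMP algorithm runs tens of times faster than the modified Apriori and BIDE algorithms, that the CFP algorithm is more efficient than the modified Apriori algorithm, and that the CNP algorithm also runs faster than the modified A-Close algorithm.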
Closed pattern mining is a critical research issue in the area of knowledge discovery and data mining with the aim of discovering interesting patterns hidden in a large amount of data. In this dissertation, we propose three algorithms, called CMP (Closed Multi-sequence Patterns mining), CFP (Closed Flexible Patterns mining), and CNP (multi-resolution Closed Numerical Patterns mining) to solve various issues extended from the problem of mining closed patterns.
The CMP algorithm is designed to find closed patterns in a multi-sequence time-series database. The CFP algorithm is developed to solve the problem of mining closed flexible patterns in a time-series database. Both the CMP and CFP algorithms transform time-series sequences into symbolic sequences in their first phase. Although analyzing symbolic sequences reduces the effect of noise and eases the mining process, these approaches may lead to pattern loss, and the sequences supporting the same pattern may look quite different.
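The abstract does not say how the symbolic transformation is performed. The following is a minimal sketch that assumes equal-width binning over a four-letter alphabet; the function name `symbolize` and the alphabet are hypothetical and are not taken from the dissertation. It illustrates the caveat above: two numerically different sequences can map to the same symbolic sequence.

```python
def symbolize(series, alphabet="abcd"):
    """Map each value to a symbol by equal-width binning (an assumed scheme)."""
    lo, hi = min(series), max(series)
    # Fall back to a width of 1.0 when the series is flat, to avoid dividing by zero.
    width = (hi - lo) / len(alphabet) or 1.0
    symbols = []
    for value in series:
        # The maximum value would land one bin too far, so clamp it into the last bin.
        index = min(int((value - lo) / width), len(alphabet) - 1)
        symbols.append(alphabet[index])
    return "".join(symbols)


# Two sequences that differ numerically but share one symbolic form.
first = symbolize([0.0, 0.9, 4.0])
second = symbolize([0.0, 0.1, 4.0])
```

Here both `first` and `second` are `"aad"`, even though the middle values differ by 0.8, which is the kind of information the symbolic representation discards.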
To overcome the problems that arise in symbolic sequence analysis, the CNP algorithm is proposed to mine closed patterns directly from numerical time-series sequences, without transforming them into symbolic sequences. The method also employs the Haar wavelet transform to discover patterns at multiple resolutions, providing different perspectives on the datasets.
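As a rough illustration of what multiple resolutions means here, the sketch below applies the averaging (unnormalized) variant of the Haar transform repeatedly, halving the sequence length at each level. It assumes the input length is a power of two; the function names are hypothetical, and how CNP actually mines patterns at each level is not described in this abstract.

```python
def haar_step(values):
    """One Haar level: pairwise averages (coarse view) and half-differences (detail)."""
    averages = [(values[i] + values[i + 1]) / 2 for i in range(0, len(values), 2)]
    details = [(values[i] - values[i + 1]) / 2 for i in range(0, len(values), 2)]
    return averages, details


def haar_resolutions(series):
    """Return the series at successively coarser resolutions, finest first."""
    n = len(series)
    if n == 0 or n & (n - 1):
        raise ValueError("series length must be a power of two")
    resolutions = [list(series)]
    while len(resolutions[-1]) > 1:
        averages, _ = haar_step(resolutions[-1])
        resolutions.append(averages)
    return resolutions


levels = haar_resolutions([2, 4, 6, 8])
```

Each entry of `levels` is a coarser view of the same data, so a pattern miner could be run on any of them to trade detail for robustness.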
All of the proposed algorithms employ projected databases to localize pattern extension, which leads to a significant runtime improvement. Moreover, each algorithm uses its own closure checking scheme and pruning strategies to avoid generating redundant candidates.
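The general idea of projected databases and closed patterns can be sketched on symbolic sequences as follows. This is a simplified PrefixSpan-style illustration under stated assumptions: sequences are strings of single-character items, patterns are subsequences, and closure is checked by a naive post-filter. It is not the CMP, CFP, or CNP algorithm, whose closure checking and pruning schemes are not detailed in this abstract, and all function names are hypothetical.

```python
def project(database, item):
    """Keep the suffix after the first occurrence of item in each sequence that has it."""
    projected = []
    for seq in database:
        pos = seq.find(item)
        if pos != -1:
            projected.append(seq[pos + 1:])
    return projected


def frequent_patterns(database, min_support, prefix=""):
    """Grow patterns one item at a time, searching only the projected database."""
    counts = {}
    for seq in database:
        for item in set(seq):  # count each item once per sequence
            counts[item] = counts.get(item, 0) + 1
    results = {}
    for item, support in sorted(counts.items()):
        if support >= min_support:
            pattern = prefix + item
            results[pattern] = support
            results.update(
                frequent_patterns(project(database, item), min_support, pattern)
            )
    return results


def is_subsequence(small, big):
    """True if small can be obtained from big by deleting items."""
    remaining = iter(big)
    return all(item in remaining for item in small)


def closed_patterns(patterns):
    """Naive filter: drop a pattern if a longer super-pattern has the same support."""
    return {
        p: s
        for p, s in patterns.items()
        if not any(
            s == t and len(q) > len(p) and is_subsequence(p, q)
            for q, t in patterns.items()
        )
    }


found = frequent_patterns(["abc", "abd", "ab"], min_support=2)
closed = closed_patterns(found)
```

With this toy input, `found` contains `a`, `b`, and `ab`, each with support 3, while `closed` keeps only `ab`, since the shorter patterns carry no additional information. Checking closure during the search rather than afterwards, as the dissertation describes, avoids generating such redundant candidates in the first place.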
The experimental results show that the CMP algorithm significantly outperforms the modified Apriori and BIDE algorithms. The CFP algorithm achieves better performance than the modified Apriori algorithm in all cases. The CNP algorithm also demonstrates a significant runtime improvement over the modified A-Close algorithm.
|Appears in Collections:||Department of Information Management|