Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/45943
Title: | Mining Closed Line Patterns in Time-series Databases (時間序列資料庫中封閉性線樣式之資料探勘) |
Author: | Tzu-Hao Tseng (曾子豪) |
Advisor: | Anthony J.T. Lee (李瑞庭) |
Keywords: | line pattern, point-based pattern, closed pattern, frequent pattern, time series database |
Publication year: | 2010 |
Degree: | Master's |
Abstract: | Time-series data is growing rapidly and is analyzed in many domains, such as financial data analysis, network traffic analysis, and moving-object tracking. Because a single line segment may cover many points, a pattern represented by points is much longer than the same pattern represented by line segments, so mining line-based patterns is more efficient than mining point-based patterns. In this thesis, we propose an efficient algorithm, CLP-Mine (Closed Line Patterns Mining), to find closed line patterns in time-series databases. The algorithm consists of three phases. First, we transform each sequence in the original time-series database into a sequence of line segments. Second, we find all frequent patterns of length one in the transformed database. Third, for each pattern found in the second phase, we recursively generate longer frequent patterns using a frequent pattern tree in a depth-first manner. During mining, we employ several pruning strategies to eliminate unnecessary candidates and a closure-checking scheme to remove non-closed patterns. CLP-Mine also uses projected databases to localize support counting and pattern generation, which reduces the search space. Experimental results show that the proposed method is more efficient and scalable than the modified A-Close algorithm on both synthetic and real datasets. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/45943 |
Full-text authorization: | Paid authorization |
Appears in collections: | Department of Information Management |
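
The first two phases named in the abstract (converting each time series into line segments, then finding frequent patterns of length one) can be illustrated with a minimal sketch. The abstract does not say how CLP-Mine segments a series or encodes a segment, so everything below is an assumption made for illustration: the greedy sliding-window segmentation, the slope labels `U`/`D`/`F`, and the names `to_segments`, `label`, `frequent_length_one`, `tol`, and `flat` are all hypothetical and are not taken from the thesis.

```python
from collections import Counter


def _fits(series, i, j, tol):
    """True if every point strictly between i and j lies within `tol`
    of the straight line joining series[i] and series[j]."""
    slope = (series[j] - series[i]) / (j - i)
    return all(
        abs(series[i] + slope * (k - i) - series[k]) <= tol
        for k in range(i + 1, j)
    )


def to_segments(series, tol=0.25):
    """Assumed phase 1: greedy sliding-window segmentation.
    Returns (start_index, end_index) pairs; adjacent segments share an endpoint."""
    segments = []
    start, n = 0, len(series)
    while start < n - 1:
        end = start + 1
        # Extend the segment while the line through its endpoints still fits.
        while end + 1 < n and _fits(series, start, end + 1, tol):
            end += 1
        segments.append((start, end))
        start = end
    return segments


def label(series, segment, flat=0.25):
    """Assumed encoding: map a segment to a symbol by its slope."""
    i, j = segment
    slope = (series[j] - series[i]) / (j - i)
    if slope > flat:
        return "U"  # rising
    if slope < -flat:
        return "D"  # falling
    return "F"      # roughly flat


def frequent_length_one(database, min_support, tol=0.25):
    """Assumed phase 2: count, for each symbol, the number of sequences
    that contain it, and keep symbols meeting `min_support`."""
    support = Counter()
    for series in database:
        # A set ensures each sequence contributes at most once per symbol.
        support.update({label(series, s) for s in to_segments(series, tol)})
    return {sym: count for sym, count in support.items() if count >= min_support}


database = [
    [0, 1, 2, 3, 2, 1, 0],
    [0, 0, 0, 1, 2, 3],
    [5, 4, 3, 2, 1, 0],
]
print(frequent_length_one(database, min_support=2))
```

Support here is counted per sequence rather than per occurrence, which is the usual convention in sequence mining; whether the thesis uses the same definition is not stated in the abstract. The third phase (depth-first growth over projected databases with closure checking) is not sketched because the abstract gives too little detail to reproduce it faithfully.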
Files in this item:
File | Size | Format | |
---|---|---|---|
ntu-99-1.pdf (not currently authorized for public access) | 654.78 kB | Adobe PDF |
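Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.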