Skip navigation

DSpace JSPUI

DSpace preserves and enables easy and open access to all types of digital content including text, images, moving images, mpegs and data sets

Learn More
DSpace logo
English
中文
  • Browse
    • Communities
      & Collections
    • Publication Year
    • Author
    • Title
    • Subject
    • Advisor
  • Search TDR
  • Rights Q&A
    • My Page
    • Receive email
      updates
    • Edit Profile
  1. NTU Theses and Dissertations Repository
  2. 電機資訊學院
  3. 電機工程學系
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/72309
Title: 於異質性平台上之高度平行化序列模式探勘
Highly Parallel Sequential Pattern Mining on a Heterogeneous Platform
Authors: Yu-Heng Hsieh
謝友恆
Advisor: 陳銘憲
Keyword: 資料探勘,序列模式探勘,通用圖形處理器,異質性平台,平行化計算,高效能計算,
Data Mining,Sequential Pattern Mining,GPGPU,Heterogeneous Platform,Parallel Computing,High Performance Computing,
Publication Year : 2018
Degree: 碩士
Abstract: 序列模式探勘可以被應用在多個領域上,例如疾病偵測及股票價格預測。雖然已經有許多演算法與加速方法被提出來解決序列模式探勘的問題,但考量支持度計算簡單又重複的特性,我們認為由中央處理器以及圖形處理器組成的異質性平台相較於傳統由中央處理器為主的機器更適合用來探勘序列模式。因此,本篇論文提出於異質性平台上之高度平行序列模式探勘演算法 (PASTA),嘗試結合圖形處理器以及中央處理器的優點以加速序列模式探勘。我們提出的PASTA採用垂直位圖資料表現方式,並用優化的前沿展開演算法探索搜尋空間。此外,我們也將演算法流程管線化以確保中央處理器與圖形處理器能同時工作而進一步提升演算法效率並設計記憶體交換機制以緩解圖形處理器上記憶體有限的問題。最後,我們比較並分析PASTA在不同資料集上與過去最好演算法的表現差異。實驗結果顯示PASTA無論在合成資料或真實資料上皆較過去最好的演算法好數十至數百倍。
Sequential pattern mining can be applied to various fields such as disease prediction and stock analysis.
Many different algorithms have been proposed for sequential pattern mining, together with acceleration methods. In this thesis, we argue that a heterogeneous platform with CPU and GPU is much more suitable for sequential pattern mining than traditional CPU-based approaches since the support counting process is inherently succinct and repetitive.
Therefore, we propose the PArallel SequenTial pAttern mining algorithm, referred to as PASTA, to accelerate sequential pattern mining by combining the merits of CPU and GPU computing.
Explicitly, PASTA adopts the vertical bitmap representation of database with an optimized expansion strategy to effectively explore and reduce the search space. In addition, a pipeline strategy is proposed to ensure that both CPU and CPU on the heterogeneous platform operate concurrently to fully utilize the computing power of the platform. Furthermore, we develop a dynamic swapping scheme to mitigate the limited memory problem of the GPU hardware without decreasing the performance.
Finally, a comprehensive experiment and a sensitivity test are conducted to analyze PASTA with different baselines. The experiments show that PASTA outperforms the state-of-the-art algorithms by orders of magnitude on both synthetic and real datasets.
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/72309
DOI: 10.6342/NTU201803546
Fulltext Rights: 有償授權
Appears in Collections:電機工程學系

Files in This Item:
File SizeFormat 
ntu-107-1.pdf
  Restricted Access
3.02 MBAdobe PDF
Show full item record


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

社群連結
聯絡資訊
10617臺北市大安區羅斯福路四段1號
No.1 Sec.4, Roosevelt Rd., Taipei, Taiwan, R.O.C. 106
Tel: (02)33662353
Email: ntuetds@ntu.edu.tw
意見箱
相關連結
館藏目錄
國內圖書館整合查詢 MetaCat
臺大學術典藏 NTU Scholars
臺大圖書館數位典藏館
本站聲明
© NTU Library All Rights Reserved