Skip navigation

DSpace

機構典藏 DSpace 系統致力於保存各式數位資料(如:文字、圖片、PDF)並使其易於取用。

點此認識 DSpace
DSpace logo
English
中文
  • 瀏覽論文
    • 校院系所
    • 出版年
    • 作者
    • 標題
    • 關鍵字
    • 指導教授
  • 搜尋 TDR
  • 授權 Q&A
    • 我的頁面
    • 接受 E-mail 通知
    • 編輯個人資料
  1. NTU Theses and Dissertations Repository
  2. 電機資訊學院
  3. 電機工程學系
請用此 Handle URI 來引用此文件: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88512
完整後設資料紀錄
DC 欄位值語言
dc.contributor.advisor郭斯彥zh_TW
dc.contributor.advisorSy-Yen Kuoen
dc.contributor.author張家鳴zh_TW
dc.contributor.authorChia-Ming Changen
dc.date.accessioned2023-08-15T16:37:56Z-
dc.date.available2023-11-09-
dc.date.copyright2023-08-15-
dc.date.issued2023-
dc.date.submitted2023-07-31-
dc.identifier.citation[1] A. Abidi, R. Zhou, L. Chen, and C. Liu. Pivot-based maximal biclique enumeration. In IJCAI, pages 3558–3564, 2020.
[2] A. Adinets. Cuda dynamic parallelism api and principles, 2014. Accessed June. 3, 2023.
[3] G. Alexe, S. Alexe, Y. Crama, S. Foldes, P. L. Hammer, and B. Simeone. Consensus algorithms for the generation of all maximal bicliques. Discrete Applied Mathematics, 145(1):11–21, 2004.
[4] M. Almasri, Y.-H. Chang, I. E. Hajj, R. Nagi, J. Xiong, and W.-m. Hwu. Parallelizing maximal clique enumeration on gpus. arXiv preprint arXiv:2212.01473, 2022.
[5] L. Chen, C. Liu, R. Zhou, J. Xu, and J. Li. Efficient maximal biclique enumeration for large sparse bipartite graphs. Proceedings of the VLDB Endowment, 15(8):1559–1571, 2022.
[6] A. Das, S.-V. Sanei-Mehri, and S. Tirthapura. Shared-memory parallel maximal clique enumeration. In 2018 IEEE 25th International Conference on High Performance Computing (HiPC), pages 62–71. IEEE, 2018.
[7] M. Harris. Using shared memory in cuda c/c++, 2013. Accessed June. 1, 2023.
[8] G. Liu, K. Sim, and J. Li. Efficient mining of large maximal bicliques. In Data Warehousing and Knowledge Discovery: 8th International Conference, DaWaK 2006, Krakow, Poland, September 4-8, 2006. Proceedings 8, pages 437–448. Springer, 2006.
[9] A. P. Mukherjee and S. Tirthapura. Enumerating maximal bicliques from a large graph using mapreduce. IEEE Transactions on Services Computing, 10(5):771–784, 2016.
[10] R. Nataraj and S. Selvan. Parallel mining of large maximal bicliques using order preserving generators. 2009.
[11] Nvidia. https://docs.nvidia.com/cuda/archive/9.0/, 2018. Accessed June. 1, 2023.
[12] Nvidia. Cuda toolkit, 2023. Accessed June. 3, 2023.
[13] R. Peeters. The maximum edge biclique problem is np-complete. Discrete Applied Mathematics, 131(3):651–654, 2003.
[14] E. Prisner. Bicliques in graphs i: Bounds on their number. Combinatorica, 20(1):109–117, 2000.
[15] X. Tang, A. Pattnaik, H. Jiang, O. Kayiran, A. Jog, S. Pai, M. Ibrahim, M. T. Kandemir, and C. R. Das. Controlled kernel launch for dynamic parallelism in gpus. In 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA), pages 649–660. IEEE, 2017.
[16] Y. Zhang, C. A. Phillips, G. L. Rogers, E. J. Baker, E. J. Chesler, and M. A. Langston. On finding bicliques in bipartite graphs: a novel algorithm and its application to the integration of diverse biological data types. BMC bioinformatics, 15:1–18, 2014.
-
dc.identifier.urihttp://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88512-
dc.description.abstract本論文透過圖形處理器優化極大二分團列舉(Maximal Biclique Enumeration)的執行效能。過往最大二分團列舉之演算法皆採用深度優先搜尋技巧,同時需要大量的遞迴呼叫與集合操作。然而,因為圖形處理器在硬體上的限制,使得深度優先搜尋與遞迴呼叫並不可行,並且圖形處理器上的計憶體限制也使得過往的資料結構,會導致計憶體不足的問題。同時,在解決前述之硬體限制之後,也因為圖形處理器本身高平化度的特性,使得如何平均有效率的將工作量分配給每個執行單元,變成影響效能的最大問題之一。
本研究改寫過往之極大二分團列舉演算法,將遞迴呼叫改為迴圈形式,減緩在圖形處理器上呼叫函式所造成的高負擔;同時,也使用新的資料結構,以解決原集合在圖形處理器上不可用之問題,進而減少了宣告集合對記憶體的需求。後續進一步使用了輕量化候選項選擇、工作竊取等技巧,緩解在圖形處理器上執行緒區塊之間工作量不平衡之情形;並使用平行交集計算與雙層平行計算,進一步利用執行緒區塊內之平行度。在提出極大二分團列舉在圖形處理器上之可能性的同時,提高此問題在圖形處理器上之效能。
zh_TW
dc.description.abstractMaximal biclique is a crucial property in many applications, such as social network analysis and computational biology. By Maximal Biclique Enumeration (MBE), we can list all the maximal biclique in a given graph and thus extract the insight of it. Alone with the growth of graph size, previous studies have introduced different algorithms to solve the MBE problem with parallelism on CPUs to improve the scalability ; however, to the best of author’s knowledge, the problem has not been implemented on Graphic Processing Unit (GPU).
In this study, it aims to improve scalability and runtime of MBE through the parallelism provided by GPUs. First, to successfully migrate the MBE algorithm to the GPUs, it proposes a loop-based algorithm to replace the recursion, and then uses a new data structure to solve the memory explosion problem. Lastly, profiting from the parallelism of GPUs, it develops lightweight candidate selection and work-stealing techniques to alleviates the workload imbalance between thread blocks. Furthermore, it leverages parallel intersection computation and two-level parallelism within each thread block to effectively utilize the inherent parallelism of these blocks.
en
dc.description.provenanceSubmitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-08-15T16:37:56Z
No. of bitstreams: 0
en
dc.description.provenanceMade available in DSpace on 2023-08-15T16:37:56Z (GMT). No. of bitstreams: 0en
dc.description.tableofcontentsVerification Letter from the Oral Examination Committee i
Acknowledgements ii
摘要 iv
Abstract v
Contents vii
Chapter 1 Introduction 1
Chapter 2 Background 5
2.1 GPU architecture 5
2.2 GPU memory hierarchy 7
2.3 Compute Unified Device Architecture 7
2.3.1 CUDA synchronization granularity 7
2.3.2 CUDA Dynamic Parallelism 8
2.4 Maximal Biclique Enumeration 9
Chapter 3 Related work 10
3.1 iMBEA 10
3.1.1 Tree Pruning 13
3.1.2 Candidate Selection 13
3.2 Share­Memory parMBE 14
3.2.1 First level recursion tree 14
3.2.2 Memory consumption 16
3.2.3 Parallel Intersection Computation 16
3.3 PMCE on GPUs 17
3.3.1 Compact Representation 17
Chapter 4 Migration the parallel MBE Algorithm to GPUs 18
4.1 Obstacles and Overview of Implementation 18
4.2 Loop­based DFS Control Flow on GPUs 19
4.3 Compact Representation in cuMBE 21
Chapter 5 Optimization of Parallel MBE Algorithm on GPUs 22
5.1 Workload Imbalance 22
5.2 Lightweight Candidate Selection 23
5.2.1 First Level Early Stop 24
5.2.2 Second Level Early Stop 25
5.3 Work Stealing 26
5.4 Parallel Intersection Computation 27
5.5 Two Level Parallelism 28
Chapter 6 Evaluation 30
6.1 Configurations 30
6.1.1 Platforms and Datasets 30
6.1.2 CPU Baselines 32
6.2 Overall Performance 32
6.3 Workload Balance 32
References 35
-
dc.language.isoen-
dc.subject二分團zh_TW
dc.subject圖形處理器zh_TW
dc.subject平行運算zh_TW
dc.subjectparallel computingen
dc.subjectGPUen
dc.subjectBicliqueen
dc.title在圖形處理器上之並行極大二分團列舉zh_TW
dc.titleParallel Maximal Biclique Enumeration on GPUsen
dc.typeThesis-
dc.date.schoolyear111-2-
dc.description.degree碩士-
dc.contributor.oralexamcommittee雷欽隆;顏嗣鈞;陳英一;林振緯zh_TW
dc.contributor.oralexamcommitteeChin-Laung Lei;Hsu-chun Yen;Ing-Yi Chen;Jenn-Wei Linen
dc.subject.keyword二分團,圖形處理器,平行運算,zh_TW
dc.subject.keywordBiclique,GPU,parallel computing,en
dc.relation.page38-
dc.identifier.doi10.6342/NTU202302235-
dc.rights.note同意授權(全球公開)-
dc.date.accepted2023-08-02-
dc.contributor.author-college電機資訊學院-
dc.contributor.author-dept電機工程學系-
顯示於系所單位:電機工程學系

文件中的檔案:
檔案 大小格式 
ntu-111-2.pdf794.17 kBAdobe PDF檢視/開啟
顯示文件簡單紀錄


系統中的文件,除了特別指名其著作權條款之外,均受到著作權保護,並且保留所有的權利。

社群連結
聯絡資訊
10617臺北市大安區羅斯福路四段1號
No.1 Sec.4, Roosevelt Rd., Taipei, Taiwan, R.O.C. 106
Tel: (02)33662353
Email: ntuetds@ntu.edu.tw
意見箱
相關連結
館藏目錄
國內圖書館整合查詢 MetaCat
臺大學術典藏 NTU Scholars
臺大圖書館數位典藏館
本站聲明
© NTU Library All Rights Reserved