Skip navigation

DSpace

機構典藏 DSpace 系統致力於保存各式數位資料(如:文字、圖片、PDF)並使其易於取用。

點此認識 DSpace
DSpace logo
English
中文
  • 瀏覽論文
    • 校院系所
    • 出版年
    • 作者
    • 標題
    • 關鍵字
    • 指導教授
  • 搜尋 TDR
  • 授權 Q&A
    • 我的頁面
    • 接受 E-mail 通知
    • 編輯個人資料
  1. NTU Theses and Dissertations Repository
  2. 理學院
  3. 應用數學科學研究所
請用此 Handle URI 來引用此文件: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/51216
完整後設資料紀錄
DC 欄位值語言
dc.contributor.advisor王偉仲(Weichung Wang)
dc.contributor.authorPo-Chuan Wangen
dc.contributor.author王柏川zh_TW
dc.date.accessioned2021-06-15T13:27:42Z-
dc.date.available2018-03-08
dc.date.copyright2016-03-08
dc.date.issued2016
dc.date.submitted2016-02-15
dc.identifier.citation[1] P. R. Amestoy, I. S. Duff, J.-Y. L’Excellent, and J. Koster. Mumps: a general purpose distributed memory sparse solver. In Applied Parallel Computing. New Paradigms for HPC in Industry and Academia, pages 121–130. Springer, 2001.
[2] S. Balay, S. Abhyankar, M. F. Adams, J. Brown, P. Brune, K. Buschelman, L. Dalcin, V. Eijkhout, W. D. Gropp, D. Kaushik, M. G. Knepley, L. C. McInnes, K. Rupp, B. F. Smith, S. Zampini, and H. Zhang. PETSc users manual. Technical Report ANL-95/11 - Revision 3.6, Argonne National Laboratory, 2015.
[3] C.Che-Ming.Hybridhierarchicalschursolversforlargesparselinearsystemsoncpu- cpu cluster. Master’s thesis, Department of Mathematics, National Taiwan University, 2014.
[4] Y. Chen, T. A. Davis, W. W. Hager, and S. Rajamanickam. Algorithm 887: Cholmod, supernodal sparse cholesky factorization and update/downdate. ACM Trans. Math. Softw., 35(3):22:1–22:14, Oct. 2008.
[5] D. Y. Chenhan, W. Wang, and D. Pierce. A cpu–gpu hybrid approach for the unsym- metric multifrontal method. Parallel Computing, 37(12):759–770, 2011.
[6] P. Labs. Paralution v1.0.0, 2015. http://www.paralution.com/.
[7] O. Schenk, K. Gartner, W. Fichtner, and A. Stricker. Pardiso: a high-performance serial and parallel sparse linear solver in semiconductor device simulation. Future Generation Computer Systems, 18(1):69–78, 2001.
[8] I. Yamazaki and X. S. Li. On techniques to improve robustness and scalability of the schur complement method.
dc.identifier.urihttp://tdr.lib.ntu.edu.tw/jspui/handle/123456789/51216-
dc.description.abstract解稀疏線性系統是科學計算領域中最核心的問題之一,隨著問題的規模增加,有效率的解稀疏線性系統是必要的。在現今的計算機架構中,計算核心的時脈由於物理限制而難以大幅提升,計算單元的設計傾向以多核心平行達到提升效能的目的。為了最大化計算資源的使用以達到更高的效率,演算法必須針對平行的架構設計演算法。多層舒爾法藉由分析以多重巢狀分割法重排後稀疏矩陣的分塊結構,將矩陣直接解法的分解過程拆解為可平行的子問題,各個子問題中再適當地以不同的方法平行加速。在此之上,經由分析各個子問題的所需計算量,再以優化過的機制將子問題均衡的分配到不同的處理器上,進而達到更高的可擴展性。zh_TW
dc.description.abstractSparse linear system solver is one of the core of scientific computing. As the scale of problem increases, to solve sparse linear systems efficiently is necessary.
In recent computer architecture, the frequency of computation core is bounded
by physical limitations, thus current design of computatation unit as CPU and
GPU use multiple cores to improve the performance.
Hierarchical Schur method expolits the block structure of multilevel nested
dissection reordered sparse linear system and decompose the direct matrix
factorization scheme into concurrent subproblems. In each subproblems we
properly applied different techniques for lower level parallelism. Moreover
by analyzing the computation cost of each subproblems, it is able to
distribute the computation load to different resources to improve overall
scalability.
en
dc.description.provenanceMade available in DSpace on 2021-06-15T13:27:42Z (GMT). No. of bitstreams: 1
ntu-105-R02246011-1.pdf: 444900 bytes, checksum: bad346faafcba5e9ef15392c3c246c83 (MD5)
Previous issue date: 2016
en
dc.description.tableofcontentsContents
口試委員會審定書 i
摘要 ii
Abstract iii
1 Introduction 1
1.1 SparseLinearSolvers............................ 1
1.2 RelatedWorks................................ 2
1.3 Purpose................................... 3
2 Background 4
2.1 NestedDissection.............................. 4
2.2 Schur’sComplementMethod........................ 8
3 Hierarchical Schur Algorithm 9
3.1 HiSAlgorithm ............................... 9
3.1.1 FactorizationAlgorithm ...................... 10
3.1.2 SolvingAlgorithm ......................... 11
3.2 Example................................... 12
3.3 TaskDistribution .............................. 17
3.3.1 CommunicationAvoidingDistribution. . . . . . . . . . . . . . . 17
3.3.2 Computation Load Balancing Distribution . . . . . . . . . . . . . 17
4 Implementation 19
4.1 C++Implementation ............................ 19
4.1.1 ObjectOrientedDesign....................... 19
4.1.2 C++11Standard .......................... 20
4.1.3 Template Metaprogramming and Compile Time Polymorphism . 20
4.1.4 MultilevelParallelism ....................... 20
4.2 SparseFactorization............................. 21
4.2.1 CPUSparseSolverandGPUSparseSolver . . . . . . . . . . . . 22
4.2.2 Schur’s Update Computation in Factorization . . . . . . . . . . . 22
4.3 DenseFactorization............................. 22
4.3.1 GPUDenseSolver ......................... 22
4.4 CompressedDenseBlockMatrix...................... 23
4.4.1 PurposeandIntroduction...................... 23
5 Results 24
5.1 TestEnvironment.............................. 24
5.2 TestProblems................................ 24
5.3 PerformanceResults ............................ 27
5.3.1 SequentialPerformance ...................... 27
5.3.2 ParallelPerformance........................ 28
6 Conclusion 30
6.1 FutureWorks ................................ 30
6.1.1 GeneralLinearSystemSolver ................... 30
6.1.2 InterprocessTaskSupport ..................... 31
Bibliography 32
dc.language.isoen
dc.subject圖形處理器zh_TW
dc.subject線性系統zh_TW
dc.subject舒爾法zh_TW
dc.subject巢狀分割法zh_TW
dc.subject直接法zh_TW
dc.subject線性系統zh_TW
dc.subject圖形處理器zh_TW
dc.subject舒爾法zh_TW
dc.subject巢狀分割法zh_TW
dc.subject直接法zh_TW
dc.subjectLinear systemen
dc.subjectDirect methoden
dc.subjectNested dissection methoden
dc.subjectGPUen
dc.subjectLinear systemen
dc.subjectDirect methoden
dc.subjectNested dissection methoden
dc.subjectGPUen
dc.title以分級平行多層舒爾法於CUDA叢集解線性系統zh_TW
dc.titleScalable Hierarchical Schur Linear System Solver with Multilevel Parallelism on CUDA Enabled Clustersen
dc.typeThesis
dc.date.schoolyear104-1
dc.description.degree碩士
dc.contributor.oralexamcommittee黃聰明(Tsung-Ming Huang),李哲榮(Che-Rung Lee)
dc.subject.keyword線性系統,圖形處理器,舒爾法,巢狀分割法,直接法,zh_TW
dc.subject.keywordLinear system,GPU,Nested dissection method,Direct method,en
dc.relation.page33
dc.rights.note有償授權
dc.date.accepted2016-02-15
dc.contributor.author-college理學院zh_TW
dc.contributor.author-dept應用數學科學研究所zh_TW
顯示於系所單位:應用數學科學研究所

文件中的檔案:
檔案 大小格式 
ntu-105-1.pdf
  未授權公開取用
434.47 kBAdobe PDF
顯示文件簡單紀錄


系統中的文件,除了特別指名其著作權條款之外,均受到著作權保護,並且保留所有的權利。

社群連結
聯絡資訊
10617臺北市大安區羅斯福路四段1號
No.1 Sec.4, Roosevelt Rd., Taipei, Taiwan, R.O.C. 106
Tel: (02)33662353
Email: ntuetds@ntu.edu.tw
意見箱
相關連結
館藏目錄
國內圖書館整合查詢 MetaCat
臺大學術典藏 NTU Scholars
臺大圖書館數位典藏館
本站聲明
© NTU Library All Rights Reserved