請用此 Handle URI 來引用此文件:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/51216
完整後設資料紀錄
DC 欄位 | 值 | 語言 |
---|---|---|
dc.contributor.advisor | 王偉仲(Weichung Wang) | |
dc.contributor.author | Po-Chuan Wang | en |
dc.contributor.author | 王柏川 | zh_TW |
dc.date.accessioned | 2021-06-15T13:27:42Z | - |
dc.date.available | 2018-03-08 | |
dc.date.copyright | 2016-03-08 | |
dc.date.issued | 2016 | |
dc.date.submitted | 2016-02-15 | |
dc.identifier.citation | [1] P. R. Amestoy, I. S. Duff, J.-Y. L’Excellent, and J. Koster. Mumps: a general purpose distributed memory sparse solver. In Applied Parallel Computing. New Paradigms for HPC in Industry and Academia, pages 121–130. Springer, 2001.
[2] S. Balay, S. Abhyankar, M. F. Adams, J. Brown, P. Brune, K. Buschelman, L. Dalcin, V. Eijkhout, W. D. Gropp, D. Kaushik, M. G. Knepley, L. C. McInnes, K. Rupp, B. F. Smith, S. Zampini, and H. Zhang. PETSc users manual. Technical Report ANL-95/11 - Revision 3.6, Argonne National Laboratory, 2015. [3] C.Che-Ming.Hybridhierarchicalschursolversforlargesparselinearsystemsoncpu- cpu cluster. Master’s thesis, Department of Mathematics, National Taiwan University, 2014. [4] Y. Chen, T. A. Davis, W. W. Hager, and S. Rajamanickam. Algorithm 887: Cholmod, supernodal sparse cholesky factorization and update/downdate. ACM Trans. Math. Softw., 35(3):22:1–22:14, Oct. 2008. [5] D. Y. Chenhan, W. Wang, and D. Pierce. A cpu–gpu hybrid approach for the unsym- metric multifrontal method. Parallel Computing, 37(12):759–770, 2011. [6] P. Labs. Paralution v1.0.0, 2015. http://www.paralution.com/. [7] O. Schenk, K. Gartner, W. Fichtner, and A. Stricker. Pardiso: a high-performance serial and parallel sparse linear solver in semiconductor device simulation. Future Generation Computer Systems, 18(1):69–78, 2001. [8] I. Yamazaki and X. S. Li. On techniques to improve robustness and scalability of the schur complement method. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/51216 | - |
dc.description.abstract | 解稀疏線性系統是科學計算領域中最核心的問題之一,隨著問題的規模增加,有效率的解稀疏線性系統是必要的。在現今的計算機架構中,計算核心的時脈由於物理限制而難以大幅提升,計算單元的設計傾向以多核心平行達到提升效能的目的。為了最大化計算資源的使用以達到更高的效率,演算法必須針對平行的架構設計演算法。多層舒爾法藉由分析以多重巢狀分割法重排後稀疏矩陣的分塊結構,將矩陣直接解法的分解過程拆解為可平行的子問題,各個子問題中再適當地以不同的方法平行加速。在此之上,經由分析各個子問題的所需計算量,再以優化過的機制將子問題均衡的分配到不同的處理器上,進而達到更高的可擴展性。 | zh_TW |
dc.description.abstract | Sparse linear system solver is one of the core of scientific computing. As the scale of problem increases, to solve sparse linear systems efficiently is necessary.
In recent computer architecture, the frequency of computation core is bounded by physical limitations, thus current design of computatation unit as CPU and GPU use multiple cores to improve the performance. Hierarchical Schur method expolits the block structure of multilevel nested dissection reordered sparse linear system and decompose the direct matrix factorization scheme into concurrent subproblems. In each subproblems we properly applied different techniques for lower level parallelism. Moreover by analyzing the computation cost of each subproblems, it is able to distribute the computation load to different resources to improve overall scalability. | en |
dc.description.provenance | Made available in DSpace on 2021-06-15T13:27:42Z (GMT). No. of bitstreams: 1 ntu-105-R02246011-1.pdf: 444900 bytes, checksum: bad346faafcba5e9ef15392c3c246c83 (MD5) Previous issue date: 2016 | en |
dc.description.tableofcontents | Contents
口試委員會審定書 i 摘要 ii Abstract iii 1 Introduction 1 1.1 SparseLinearSolvers............................ 1 1.2 RelatedWorks................................ 2 1.3 Purpose................................... 3 2 Background 4 2.1 NestedDissection.............................. 4 2.2 Schur’sComplementMethod........................ 8 3 Hierarchical Schur Algorithm 9 3.1 HiSAlgorithm ............................... 9 3.1.1 FactorizationAlgorithm ...................... 10 3.1.2 SolvingAlgorithm ......................... 11 3.2 Example................................... 12 3.3 TaskDistribution .............................. 17 3.3.1 CommunicationAvoidingDistribution. . . . . . . . . . . . . . . 17 3.3.2 Computation Load Balancing Distribution . . . . . . . . . . . . . 17 4 Implementation 19 4.1 C++Implementation ............................ 19 4.1.1 ObjectOrientedDesign....................... 19 4.1.2 C++11Standard .......................... 20 4.1.3 Template Metaprogramming and Compile Time Polymorphism . 20 4.1.4 MultilevelParallelism ....................... 20 4.2 SparseFactorization............................. 21 4.2.1 CPUSparseSolverandGPUSparseSolver . . . . . . . . . . . . 22 4.2.2 Schur’s Update Computation in Factorization . . . . . . . . . . . 22 4.3 DenseFactorization............................. 22 4.3.1 GPUDenseSolver ......................... 22 4.4 CompressedDenseBlockMatrix...................... 23 4.4.1 PurposeandIntroduction...................... 23 5 Results 24 5.1 TestEnvironment.............................. 24 5.2 TestProblems................................ 24 5.3 PerformanceResults ............................ 27 5.3.1 SequentialPerformance ...................... 27 5.3.2 ParallelPerformance........................ 28 6 Conclusion 30 6.1 FutureWorks ................................ 30 6.1.1 GeneralLinearSystemSolver ................... 30 6.1.2 InterprocessTaskSupport ..................... 31 Bibliography 32 | |
dc.language.iso | en | |
dc.title | 以分級平行多層舒爾法於CUDA叢集解線性系統 | zh_TW |
dc.title | Scalable Hierarchical Schur Linear System Solver with Multilevel Parallelism on CUDA Enabled Clusters | en |
dc.type | Thesis | |
dc.date.schoolyear | 104-1 | |
dc.description.degree | 碩士 | |
dc.contributor.oralexamcommittee | 黃聰明(Tsung-Ming Huang),李哲榮(Che-Rung Lee) | |
dc.subject.keyword | 線性系統,圖形處理器,舒爾法,巢狀分割法,直接法, | zh_TW |
dc.subject.keyword | Linear system,GPU,Nested dissection method,Direct method, | en |
dc.relation.page | 33 | |
dc.rights.note | 有償授權 | |
dc.date.accepted | 2016-02-15 | |
dc.contributor.author-college | 理學院 | zh_TW |
dc.contributor.author-dept | 應用數學科學研究所 | zh_TW |
顯示於系所單位: | 應用數學科學研究所 |
文件中的檔案:
檔案 | 大小 | 格式 | |
---|---|---|---|
ntu-105-1.pdf 目前未授權公開取用 | 434.47 kB | Adobe PDF |
系統中的文件,除了特別指名其著作權條款之外,均受到著作權保護,並且保留所有的權利。