Affine calculator程式平行化計算器實作於GCC編譯器

Ying-Jie Wang; 王穎杰

請用此 Handle URI 來引用此文件： http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/43805

完整後設資料紀錄

DC 欄位	值	語言
dc.contributor.advisor	廖世偉
dc.contributor.author	Ying-Jie Wang	en
dc.contributor.author	王穎杰	zh_TW
dc.date.accessioned	2021-06-15T02:29:10Z	-
dc.date.available	2019-08-20
dc.date.copyright	2009-08-19
dc.date.issued	2009
dc.date.submitted	2009-08-17
dc.identifier.citation	[1] Compilers: Principles, Techniques and Tools (2nd Edition, 2006) (the 'Dragon Book') by Alfred V. Aho, Monica S. Lam, Ravi Sethi, and Jeffrey D. Ullman(ISBN 0-321-48681-1) [2] A. Lim, S. Liao, and M. Lam. Blocking and array contraction across arbitrarily nested loops using affine partitioning. In ACM SIGPLAN PPoPP, pages 103–112, 2001. [3] A. W. Lim, G. I. Cheong, and M. S. Lam. An affine partitioning algorithm to maximize parallelism and minimize communication. In ACM Intl. Conf. on Supercomputing, pages 228–237, 1999. [4] A. W. Lim and M. S. Lam. Maximizing parallelism and minimizing synchronization with affine partitions. Parallel Computing, 24(3-4):445–475, 1998. [5] R. Andonov, S. Balev, S. Rajopadhye, and N. Yanev. Optimal Semioblique tiling. IEEE Trans. Par. & Dist. Sys., 14(9):944–960, 2003. [6] C. Bastoul. Code generation in the polyhedral model is easier than you think. In IEEE Intl. Conf. on Parallel Architectures and Compilation Techniques (PACT’04), pages 7–16, Sept. 2004. [7] C. Bastoul and P. Feautrier. Improving data locality by chunking. In Intl. Conf. on Compiler Construction (ETAPS CC), pages 320–335, Warsaw, Apr. 2003. [8] U. Bondhugula, M. Baskaran, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan. Affine transformations for communication minimal parallelization and locality optimization of arbitrarily-nested loop sequences. Technical Report OSU-CISRC-5/07-TR43, The Ohio State University, May 2007. [9] U. Bondhugula, M. Baskaran, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan. Automatic transformations for communication-minimized parallelization and locality optimization in the polyhedral model. In Intl. Conf. on Compiler Construction(ETAPS CC), Apr. 2008. [10] U. Bondhugula, J. Ramanujam, and P. Sadayappan. Pluto: A practical and fully automatic polyhedral parallelizer and locality optimizer. Technical Report OSU-CISRC-10/07-TR70, The Ohio State University, Oct. 2007. [11] Uday Bondhugula, Albert Hartono, J. Ramanujam, and P. Sadayappan. A practical automatic polyhedral program optimization system. In ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2008. [12] P. Boulet, A. Darte, G.-A. Silber, and F. Vivien. Loop parallelization algorithms: From parallelism extraction to code generation. Parallel Computing, 24(3–4):421–444, 1998. [13] A. Cohen, S. Girbal, D. Parello, M. Sigler, O. Temam, and N. Vasilache. Facilitating the search for compositions of program transformations. In ACM Intl. Conf. on Supercomputing, pages 151–160, June 2005. [14] P. Feautrier. Parametric integer programming. RAIRO Recherche Op’erationnelle, 22(3):243–268, 1988. [15] P. Feautrier. Dataflow analysis of scalar and array references. Intl. J. of Parallel Programming, 20(1):23–53, Feb. 1991. [16] P. Feautrier. Some efficient solutions to the affine scheduling problem: I. one-dimensional time. Intl. J. of Parallel Programming, 21(5):313–348, 1992. [17] P. Feautrier. Some efficient solutions to the affine scheduling problem. part II. multidimensional time. Intl. J. of Parallel Programming, 21(6):389–420, 1992. [18] R. Allen and K. Kennedy. Automatic translation of Fortran programs to vector form. ACM Trans. on Programming Languages and Systems, 9(4):491–542, 1987. [19] N. Ahmed, N. Mateev, and K. Pingali. Synthesizing transformations for locality enhancement of imperfectly-nested loops. Intl. J. of Parallel Programming, 29(5), Oct. 2001. [20] C. Ancourt and F. Irigoin. Scanning polyhedra with do loops. In ACM SIGPLAN PPoPP’91, pages 39–50, 1991.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/43805	-
dc.description.abstract	雖然多核心處理器架構越來越普遍，但是在這種架構上撰寫程式並有效發揮每個處理單元的運算能力仍然是一件困難的事情。數個研究主題目標在於解決此類問題，自動平行化即是其中強而有力但同時也是相當困難的解決方法之一。 Affine partitioning 提供了一種有系統的方法，能夠在多核心處理器的架構上找到近似於最佳的計算與資料分解法。 Affine partitioning 是一個強大的統一常論，它的架構融合了許多高階的優化方法，例如: 迴圈調換，迴圈反轉，迴圈融合，迴圈分離等許多方法。基於這個統一的架構，能夠在一個由仿射資料參考所組成的巢狀迴圈中找到最大的平行度同時將程式中的通訊降到最小。在這篇論文中，我們提出一個GCC編譯器中的分析過程，我們稱為” affine calculator pass”，這個分析過程將affine partitioning 演算法和GCC編譯器整合，達到了自動平型化的目標。我們成功的編譯了數個以C語言撰寫的程式，這些程式由包含不同型態的陣列存取以及不同類型的資料相依的巢狀迴圈所組成。完成編譯後輸出的是一個執行檔，每個執行檔都使用了GOMP 函式庫來達到充分利用處理器的每個運算處理單元。	zh_TW
dc.description.abstract	Multicore processors have become pervasive in these days. But, it is still difficult to program these architectures to effectively utilize the computation power of multiple processing units. There are several research topics addressing this issue, one of the strong and at the same time very hard approaches is automatic parallelization. Affine partitioning provides a systematic framework to find asymptotically optimal computation and data decomposition for multicore processors. Affine partitioning is a powerful unifying theory, the framework uniformly models a large class of high level optimizations such as loop interchange, reversal, skewing, fusion, fission, re-indexing, scaling. Based on this unified framework, that maximize parallelism while minimizing communication in programs with arbitrary loop nestings and affine data accesses. In this papaer, we proposed a compiler pass in GCC called “affine calculator pass” which integrates affine partitioning framework and GCC to achieve the goal of automatic parallelization. We have successfully compiled some programs in C language which composed of different types of array access dependencies in arbitrary nested loop. And the outputs of the compilation are executables. Each of these executables can execute in parallel using GOMP library to utilize the processing units of multicore processor.	en
dc.description.provenance	Made available in DSpace on 2021-06-15T02:29:10Z (GMT). No. of bitstreams: 1 ntu-98-R96922078-1.pdf: 641280 bytes, checksum: 795715185368d569dfda19072410ded9 (MD5) Previous issue date: 2009	en
dc.description.tableofcontents	Acknowledgements I Chinese Abstract II Abstract III List of Figures V Contents VI Chapter 1 Motivation and Introduction 1 1.1. Motivation 1 1.2. Introduction 2 Chapter 2 Background 4 2.1. Basic Concepts 4 2.1.1. Linear 4 2.1.2. Affine 4 2.1.3. Polyhedron, Polytope 5 2.2. Introduction to Affine Calculator 6 2.2.1. Affine Partitioning 6 2.2.2. GCC(Gnu Compiler Collection) 13 2.2.3. The GIMPLE Tree in GCC 14 Chapter 3 Overview 16 3.1. Parsing Time 17 3.2. Parse the Affine Expression and Mapping It 18 3.3. Add COND_EXPR 19 3.4. Add OpenMP Directives 21 Chapter 4 Experiment Result 25 4.1. Four Case Studies 25 4.1.1. Simple 25 4.1.2. Matrix Multiplication 27 4.1.3. Btrix 27 4.1.4. Black and White 28 Chapter 5 Conclusion 30 Chapter 6 Future Work 31 Appendix 32 Matrix Multiplication 32 Btrix 35 Black and White 41 References 45
dc.language.iso	en
dc.subject	編譯器	zh_TW
dc.subject	自動平行化	zh_TW
dc.subject	Auto parallelization	en
dc.subject	OpenMP	en
dc.subject	Affine partitioning	en
dc.subject	Compiler	en
dc.subject	GCC	en
dc.title	Affine calculator程式平行化計算器實作於GCC編譯器	zh_TW
dc.title	Affine calculator in GCC	en
dc.type	Thesis
dc.date.schoolyear	97-2
dc.description.degree	碩士
dc.contributor.oralexamcommittee	陳健輝,陳俊良
dc.subject.keyword	自動平行化,編譯器,	zh_TW
dc.subject.keyword	Auto parallelization,Compiler,GCC,Affine partitioning,OpenMP,	en
dc.relation.page	47
dc.rights.note	有償授權
dc.date.accepted	2009-08-17
dc.contributor.author-college	電機資訊學院	zh_TW
dc.contributor.author-dept	資訊工程學研究所	zh_TW
顯示於系所單位：	資訊工程學系

文件中的檔案：

檔案	大小	格式
ntu-98-1.pdf 未授權公開取用	626.25 kB	Adobe PDF

顯示文件簡單紀錄

系統中的文件，除了特別指名其著作權條款之外，均受到著作權保護，並且保留所有的權利。