Design and Analysis of Compiler Auto-vectorization and Superword Level Parallelism on Arm and RISC-V Architecture

彭旻翊; Ming-Yi Peng

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92163

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	廖世偉	zh_TW
dc.contributor.advisor	Shih-Wei Liao	en
dc.contributor.author	彭旻翊	zh_TW
dc.contributor.author	Ming-Yi Peng	en
dc.date.accessioned	2024-03-07T16:22:36Z	-
dc.date.available	2024-03-08	-
dc.date.copyright	2024-03-07	-
dc.date.issued	2024	-
dc.date.submitted	2024-02-18	-
dc.identifier.citation	[1] N. Stephens, S. Biles, M. Boettcher, J. Eapen, M. Eyole, G. Gabrielli, M. Horsnell, G. Magklis, A. Martinez, N. Premillieu, A. Reid, A. Rico, and P. Walker, "The ARM Scalable Vector Extension," IEEE Micro, vol. 37, pp. 26-39, 2017, doi: 10.1109/mm.2017.35. [2] M. Perotti, M. Cavalcante, N. Wistoff, R. Andri, L. Cavigelli, and L. Benini, "A “New Ara” for Vector Computing: An Open Source Highly Efficient RISC-V V 1.0 Vector Processor Design," presented at the 2022 IEEE 33rd International Conference on Application-specific Systems, Architectures and Processors (ASAP), 2022. [3] R. Allen and K. Kennedy, "PFC: A program to convert Fortran to parallel form," Department of Mathematical Sciences, Rice University, Technical Report, 1982. [4] R. Allen and K. Kennedy, "Automatic translation of FORTRAN programs to vector form," ACM Trans. Program. Lang. Syst., vol. 9, no. 4, pp. 491-542, 1987, doi: 10.1145/29873.29875. [5] V. Porpodas, R. C. O. Rocha, E. Brevnov, L. F. W. Góes, and T. Mattson, "Super-Node SLP: Optimized Vectorization for Code Sequences Containing Operators and Their Inverse Elements," in 2019 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 16-20 Feb. 2019, pp. 206-216, doi: 10.1109/CGO.2019.8661192. [6] D. Callahan, J. J. Dongarra, and D. Levine, "Vectorizing compilers: a test suite and results," Proceedings. SUPERCOMPUTING '88, pp. 98-105, 1988. [7] C. Lattner and V. Adve, "LLVM: a compilation framework for lifelong program analysis & transformation," in International Symposium on Code Generation and Optimization, 2004. CGO 2004., 20-24 March 2004, pp. 75-86, doi: 10.1109/CGO.2004.1281665. [8] C. Lattner et al., "MLIR: A Compiler Infrastructure for the End of Moore's Law," CoRR, vol. abs/2002.11054, 2020. [9] C. Lattner et al., "MLIR: Scaling Compiler Infrastructure for Domain Specific Computation," in 2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 27 Feb.-3 March 2021, pp. 2-14, doi: 10.1109/CGO51591.2021.9370308. [10] S.-L. Wu, X.-Y. Wang, M.-Y. Peng, and S.-W. Liao, "Accelerating OpenVX through Halide and MLIR," Journal of Signal Processing Systems, vol. 95, no. 5, pp. 571-584, 2023, doi: 10.1007/s11265-022-01826-8. [11] S. Larsen and S. Amarasinghe, "Exploiting superword level parallelism with multimedia instruction sets," presented at the Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation, Vancouver, British Columbia, Canada, 2000. [12] I. Rosen, D. Nuzman, and A. Zaks, "Loop-aware SLP in GCC," presented at the GCC and GNU Toolchain Developers' Summit 2007, Ottawa, ON, Canada, 2007. [13] K. D. Cooper and L. Torczon, Engineering a Compiler, 2nd ed. 2012. [14] The LLVM Compiler Infrastructure. "Iterating over def-use & use-def chains." https://llvm.org/docs/ProgrammersManual.html#iterating-over-def-use-use-def-chains. [15] F. Bellard, "QEMU, a fast and portable dynamic translator," presented at the Proceedings of the annual conference on USENIX Annual Technical Conference, Anaheim, CA, 2005. [16] Y. Chen, C. Mendis, M. Carbin, and S. Amarasinghe, "VeGen: a vectorizer generator for SIMD and beyond," presented at the Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Virtual, USA, 2021. [17] C. Mendis, C. Yang, Y. Pu, S. Amarasinghe, and M. Carbin, "Compiler auto-vectorization with imitation learning," in Proceedings of the 33rd International Conference on Neural Information Processing Systems: Curran Associates Inc., 2019, Article 1310. [18] V. Porpodas, R. C. O. Rocha, and L. F. W. Góes, "VW-SLP: auto-vectorization with adaptive vector width," presented at the Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, Limassol, Cyprus, 2018. [19] C. Mendis and S. Amarasinghe, "goSLP: globally optimized superword level parallelism framework," Proc. ACM Program. Lang., vol. 2, no. OOPSLA, Article 110, 2018, doi: 10.1145/3276480. [20] N. Adit and A. Sampson, "Performance Left on the Table: An Evaluation of Compiler Autovectorization for RISC-V," IEEE Micro, vol. 42, no. 5, pp. 41-48, 2022, doi: 10.1109/MM.2022.3184867.	-
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92163	-
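dc.description.abstract	In today's data-driven era, the demand for processing large datasets is growing rapidly, which makes program execution performance an important research direction. Modern processors are commonly equipped with Single Instruction, Multiple Data (SIMD) processing units, and the various instruction set architectures support corresponding vector extensions: Arm supports Neon and SVE, and RISC-V supports RVV. Auto-vectorization is a compiler optimization technique that automatically converts program code into vector instructions at compile time, so that the vector processing units are used effectively and program execution becomes more efficient. This study examines the superword level parallelism (SLP) auto-vectorization technique implemented in the LLVM compiler and improves the current algorithm in areas it does not yet cover, in order to broaden the scenarios where SLP applies. In addition, simulated performance tests were run on both the Arm and RISC-V architectures to verify the practical benefit of the improved algorithm.	zh_TW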
dc.description.abstract	In the data-driven era, the demand for processing large datasets has increased rapidly, and execution efficiency has become an important research direction. Modern processors are commonly equipped with Single Instruction, Multiple Data (SIMD) processing units, and various instruction set architectures support corresponding vector extensions. Auto-vectorization is a compiler optimization technique that automatically translates program code into vector instructions at compile time, thereby making full use of the vector processing units and improving execution efficiency. We investigate the superword level parallelism (SLP) auto-vectorization implemented in the LLVM compiler and refine the existing algorithm to broaden the scenarios to which it applies. We also run simulated performance evaluations on the Arm and RISC-V architectures to verify the actual benefit of the improved algorithm.	en
dc.description.provenance	Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-03-07T16:22:36Z No. of bitstreams: 0	en
dc.description.provenance	Made available in DSpace on 2024-03-07T16:22:36Z (GMT). No. of bitstreams: 0	en
dc.description.tableofcontents	Oral Examination Committee Certification; Abstract (Chinese); ABSTRACT; TABLE OF CONTENTS; LIST OF FIGURES; LIST OF TABLES; CHAPTER 1. INTRODUCTION; CHAPTER 2. BACKGROUND; 2.1 LLVM; 2.2 AUTO-VECTORIZATION; 2.3 SUPERWORD LEVEL PARALLELISM; 2.4 ARM; 2.5 RISC-V; CHAPTER 3. MOTIVATION; 3.1 MOTIVATION; CHAPTER 4. METHOD; 4.1 RECURSIVELY BUILDING THE VECTORIZATION TREE; 4.2 CHECKING COMMUTATIVITY; 4.3 COMMUTATIVE REORDERING; CHAPTER 5. EVALUATION AND DISCUSSION; 5.1 ENVIRONMENTAL SETUP; 5.2 EVALUATION AND DISCUSSION; CHAPTER 6. CONCLUSION AND FUTURE WORK; 6.1 CONCLUSION; 6.2 RELATED WORK; 6.3 FUTURE WORK; REFERENCES	-
dc.language.iso	en	-
dc.subject	Single Instruction, Multiple Data	zh_TW
dc.subject	Auto-vectorization	zh_TW
dc.subject	Superword Level Parallelism	zh_TW
dc.subject	LLVM	zh_TW
dc.subject	Arm	zh_TW
dc.subject	RISC-V	zh_TW
dc.subject	Arm	en
dc.subject	SLP	en
dc.subject	LLVM	en
dc.subject	RISC-V	en
dc.subject	SIMD	en
dc.subject	Auto-vectorization	en
dc.title	Arm和RISC-V架構編譯器自動向量化和超字組平行的設計與分析	zh_TW
dc.title	Design and Analysis of Compiler Auto-vectorization and Superword Level Parallelism on Arm and RISC-V Architecture	en
dc.type	Thesis	-
dc.date.schoolyear	112-1	-
dc.description.degree	Master's	-
dc.contributor.oralexamcommittee	傅楸善;黃敬群;鄭振牟	zh_TW
dc.contributor.oralexamcommittee	Chiou-Shann Fuh;Ching-Chun Huang;Chen-Mou Cheng	en
dc.subject.keyword	Single Instruction Multiple Data, Auto-vectorization, Superword Level Parallelism, LLVM, Arm, RISC-V	zh_TW
dc.subject.keyword	SIMD, Auto-vectorization, SLP, LLVM, Arm, RISC-V	en
dc.relation.page	28	-
dc.identifier.doi	10.6342/NTU202400692	-
dc.rights.note	Not authorized	-
dc.date.accepted	2024-02-18	-
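dc.contributor.author-college	College of Electrical Engineering and Computer Science	-
dc.contributor.author-dept	Department of Computer Science and Information Engineering	-
Appears in Collections:	Department of Computer Science and Information Engineering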

Files in This Item:

File	Size	Format
ntu-112-1.pdf (not authorized for public access)	2.57 MB	Adobe PDF

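Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.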