整合 LLVM SandboxIR 於超字階層平行性向量化器以支援IR還原機制藉以提升儲存指令鏈向量化效能

許庭偉; Ting-Wei Hsu

請用此 Handle URI 來引用此文件： http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98248

完整後設資料紀錄

DC 欄位	值	語言
dc.contributor.advisor	廖世偉	zh_TW
dc.contributor.advisor	Shih-Wei Liao	en
dc.contributor.author	許庭偉	zh_TW
dc.contributor.author	Ting-Wei Hsu	en
dc.date.accessioned	2025-07-31T16:06:00Z	-
dc.date.available	2025-08-01	-
dc.date.copyright	2025-07-31	-
dc.date.issued	2025	-
dc.date.submitted	2025-07-29	-
dc.identifier.citation	[1] The llvm compiler infrastructure. https://llvm.org. Accessed: 2025-06-07. [2] Llvm language reference manual. https://llvm.org/docs/LangRef.html. Accessed: 2025-07-08. [3] Llvm value class. https://llvm.org/doxygen/classllvm_1_1Value.html. Accessed: 2025-07-08. [4] Sandbox ir: A transactional layer over llvm ir. https://llvm.org/docs/SandboxIR.html. Accessed: 2025-06-07. [5] L. Carpentieri, M. VazirPanah, and B. Cosenza. A performance analysis of autovectorization on rvv risc-v boards. In 2025 33rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP), pages 129–136. IEEE, 2025. [6] C.-C. Chen. Optimizing slice selection strategy for store vectorization in superword level parallelism. Master thesis, National Taiwan University, Taipei, Taiwan, 2024. [7] K.-H. Chen, B.-Y. Shen, and W. Yang. An automatic superword vectorization in llvm. In 16th Workshop on Compiler Techniques for High-Performance and Embedded Computing, pages 19–27, 2010. [8] Y. Chen, C. Mendis, and S. Amarasinghe. All you need is superword-level parallelism: systematic control-flow vectorization with slp. In Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation, pages 301–315, 2022. [9] J. Feng, Y. He, Q. Tao, and H. Ma. An slp vectorization method based on equivalent extended transformation. Wireless Communications and Mobile Computing, 2022(1):1832522, 2022. [10] R. Kumar, A. Martínez, and A. Gonzalez. Assisting static compiler vectorization with a speculative dynamic vectorizer in an hw/sw codesigned environment. ACM Transactions on Computer Systems (TOCS), 33(4):1–33, 2016. [11] C. Mendis and S. Amarasinghe. goslp: globally optimized superword level parallelism framework. Proceedings of the ACM on Programming Languages, 2(OOPSLA):1–28, 2018. [12] K. Modin. A comparison of auto-vectorization performance between gcc and llvm for the risc-v vector extension, 2024. [13] D. Naishlos. Autovectorization in gcc. In Proceedings of the 2004 GCC developers summit, pages 105–118. Citeseer, 2004. [14] D. Nuzman and R. Henderson. Multi-platform auto-vectorization. In International Symposium on Code Generation and Optimization (CGO’06), pages 11–pp. IEEE, 2006. [15] D. Nuzman, I. Rosen, and A. Zaks. Auto-vectorization of interleaved data for simd. ACM SIGPLAN Notices, 41(6):132–143, 2006. [16] A. Pajuelo, A. González, and M. Valero. Speculative dynamic vectorization. ACM SIGARCH Computer Architecture News, 30(2):271–280, 2002. [17] A. Pohl, B. Cosenza, and B. Juurlink. Cost modelling for vectorization on arm. In 2018 IEEE International Conference on Cluster Computing (CLUSTER), pages 644–645. IEEE, 2018. [18] A. Pohl, B. Cosenza, and B. Juurlink. Portable cost modeling for auto-vectorizers. In 2019 IEEE 27th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), pages 359–369. IEEE, 2019. [19] A. Pohl, B. Cosenza, and B. Juurlink. Vectorization cost modeling for neon, avx and sve. Performance Evaluation, 140:102106, 2020. [20] V. Porpodas. Supergraph-slp auto-vectorization. In 2017 26th International Conference on Parallel Architectures and Compilation Techniques (PACT), pages 330–342. IEEE, 2017. [21] V. Porpodas, A. Magni, and T. M. Jones. pslp: padded slp automatic vectorization. In 2015 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), pages 190–201. IEEE, 2015. [22] V. Porpodas, R. C. Rocha, and L. F. Góes. Look-ahead slp: auto-vectorization in the presence of commutative operations. In Proceedings of the 2018 International Symposium on Code Generation and Optimization, pages 163–174, 2018. [23] V. Porpodas, R. C. Rocha, and L. F. Góes. vw-slp: auto-vectorization with adaptive vector width. In Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, pages 1–15, 2018. [24] I. Rosen, D. Nuzman, and A. Zaks. Loop-aware slp in gcc. In GCC Developers Summit, pages 131–142, 2007.	-
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98248	-
dc.description.abstract	超字階層平行性（Superword-Level Parallelism, SLP）是一種關鍵的向量化技術，用於從循序執行的程式碼中提取資料平行性。然而，其在現代編譯器中的效能，往往受到其底層貪婪式啟發（greedy heuristics）演算法的限制。傳統的由下而上（bottom-up）SLP 向量化器所做的決策是局部且不可逆的，這種方法雖然計算效率高，卻常常因錯失更具全域效益的指令分組機會，而導致次優的向量化結果。本論文旨在針對此一根本性限制，提出一種新穎的混合式搜索算法，該算法能系統性地探索解空間，以在成本模型之上找出真正的最優向量化序列。我們的研究方法將 SLP 儲存指令鏈向量化問題，重新建構成一個狀態空間搜索問題，其目標是找到一條成本效益最高的向量化決策路徑。此方法的核心在於整合 LLVM SandboxIR 框架，此框架提供了一個交易式（transactional）的環境，使我們能夠執行一連串的轉換、評估其對全域成本的影響，並在必要時將其復原。這種能力讓我們得以反覆探索不同的向量化策略，而無需過早地鎖定於次優的選擇。我們所提出的演算法搜索所有可能的向量化序列所構成的狀態空間。此方法的一項關鍵創新，是整合了基於動態規劃（Dynamic Programming, DP）的提早終止條件。這種混合設計能有效修剪搜尋空間的大部分，使尋找全域最優解在多數實際程式碼中成為可行。我們詳細分析了演算法設計、在 LLVM SLP 向量化器中的實作，以及其計算複雜度，並展示其在理論與實務上相較傳統貪婪法的優勢。本研究證明，藉由超越貪婪啟發式並善用 SandboxIR 框架，LLVM 編譯器能實現更高品質的向量化，並在成本模型下釋放更大的效能潛力。	zh_TW
dc.description.abstract	Superword-Level Parallelism (SLP) is a crucial vectorization technique for extracting data-level parallelism from straight-line code, yet its effectiveness in modern compilers is often constrained by its underlying greedy heuristics based on the estimated-based cost model. The traditional bottom-up SLP vectorizer makes localized, irreversible decisions that, while computationally efficient, can lead to sub-optimal vectorization plans by forgoing opportunities for more globally efficient instruction groupings. This thesis addresses this fundamental limitation by proposing a novel hybrid search algorithm that systematically explores the solution space to identify a better vectorization sequence. Our methodology re-frames the SLP store-chain vectorization problem as a state-space search problem, where the goal is to find the most cost-effective path of vectorization decisions. Central to our approach is the integration of the LLVM SandboxIR, a transactional framework that enables us to apply sequences of transformations, evaluate their global cost impact, and subsequently roll them back. This capability allows for an iterative exploration of different vectorization strategies without committing to premature, sub-optimal choices. The proposed algorithm employs a iterative search to traverse the state space of possible vectorization sequences. A key innovation of our approach is the integration of a dynamic programming (DP) based early-exit condition. When the algorithm detects that the remaining un-vectorized operations have no inter-dependencies, it switches from the computationally intensive state-space search to a highly efficient DP routine to solve the remainder of the problem optimally. This hybrid design effectively prunes large portions of the search space, making the quest for a global optimum computationally feasible for a wide range of real-world code. We provide a detailed analysis of the algorithm's design, its implementation within the LLVM SLP vectorizer, and its computational complexity, demonstrating its theoretical and practical advantages over the traditional greedy approach. Our work shows that by moving beyond greedy heuristics and leveraging SandboxIR frameworks, LLVM compiler can achieve a higher quality of vectorization and unlock greater performance on the cost model.	en
dc.description.provenance	Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-07-31T16:06:00Z No. of bitstreams: 0	en
dc.description.provenance	Made available in DSpace on 2025-07-31T16:06:00Z (GMT). No. of bitstreams: 0	en
dc.description.tableofcontents	Verification Letter from the Oral Examination Committee i Acknowledgements ii 摘要 iii Abstract v Contents vii List of Figures x List of Tables xi Chapter 1 Introduction 1 1.1 Introduction 1 1.2 Contributions 4 Chapter 2 Background 6 2.1 LLVM 6 2.2 LLVM IR 9 2.3 IRBuilder 11 2.4 LLVM SandboxIR 12 2.5 LLVM SLP Vectorizer 16 2.6 The cost model heuristic of LLVM Auto-vectorizer 19 2.7 The Slice Selection Heuristic 20 2.7.1 How External Uses Create Inter-Slice Dependencies 23 2.7.2 Conclusion: Why a State-Space Search Is Necessary 25 Chapter 3 Design and Implementation 27 3.1 Supporting IR Rollback 27 3.1.1 Instruction Creation 27 3.1.2 Instruction Deletion 29 3.1.3 Instruction Movement 30 3.1.4 Use-Def Information Modifications 31 3.2 Rollback Related States 32 3.3 Applying to the Slice Selection Problem 34 3.3.1 Algorithm Design and Framework 34 3.3.2 Core Data Structures 34 3.3.3 Pseudocode and Implementation Details 35 3.3.4 Complexity Analysis 39 3.3.4.1 Time Complexity 39 3.3.4.2 Space Complexity 42 Chapter 4 Evaluation 44 4.1 Experimental Setup 44 4.2 Cost Reduction 45 4.2.1 SPECCPU 2006 Benchmarks 45 4.2.2 SPECCPU 2017 Benchmarks 47 4.3 Compilation Time 47 4.4 Discussion 49 4.5 Summary 50 Chapter 5 Conclusion 51 5.1 Conclusion 51 5.2 Discussion and Future Work 51 References 53	-
dc.language.iso	en	-
dc.subject	超字階層平行性	zh_TW
dc.subject	LLVM	zh_TW
dc.subject	單指令流多資料流	zh_TW
dc.subject	SandboxIR	zh_TW
dc.subject	自動向量化	zh_TW
dc.subject	Auto-Vectorization	en
dc.subject	SandboxIR	en
dc.subject	SIMD	en
dc.subject	Superword-Level Parallelism	en
dc.subject	LLVM	en
dc.title	整合 LLVM SandboxIR 於超字階層平行性向量化器以支援IR還原機制藉以提升儲存指令鏈向量化效能	zh_TW
dc.title	Supporting IR Rollback Mechanism by Integrating LLVM SandboxIR into Superword-Level Parallelism Vectorizer to Improve the Performance of Store Chain Vectorization	en
dc.type	Thesis	-
dc.date.schoolyear	113-2	-
dc.description.degree	碩士	-
dc.contributor.oralexamcommittee	黃敬群;洪鼎詠;李逸元;盧瑞山	zh_TW
dc.contributor.oralexamcommittee	Ching-Chun Huang;Ding-Yong Hong;Yi-Yuan Lee;Ruei-Shan Lu	en
dc.subject.keyword	LLVM,超字階層平行性,自動向量化,SandboxIR,單指令流多資料流,	zh_TW
dc.subject.keyword	LLVM,Superword-Level Parallelism,Auto-Vectorization,SandboxIR,SIMD,	en
dc.relation.page	56	-
dc.identifier.doi	10.6342/NTU202502571	-
dc.rights.note	同意授權(限校園內公開)	-
dc.date.accepted	2025-07-30	-
dc.contributor.author-college	電機資訊學院	-
dc.contributor.author-dept	資訊工程學系	-
dc.date.embargo-lift	2025-08-01	-
顯示於系所單位：	資訊工程學系

文件中的檔案：

檔案	大小	格式
ntu-113-2.pdf 授權僅限NTU校內IP使用（校園外請利用VPN校外連線服務）	10.16 MB	Adobe PDF

顯示文件簡單紀錄

系統中的文件，除了特別指名其著作權條款之外，均受到著作權保護，並且保留所有的權利。