Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97695
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 廖世偉 | zh_TW
dc.contributor.advisor | Shih-Wei Liao | en
dc.contributor.author | 劉修齊 | zh_TW
dc.contributor.author | Hsiu-Chi Liu | en
dc.date.accessioned | 2025-07-11T16:13:27Z | -
dc.date.available | 2025-07-12 | -
dc.date.copyright | 2025-07-11 | -
dc.date.issued | 2025 | -
dc.date.submitted | 2025-07-02 | -
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97695 | -
dc.description.abstract | 近年來,RISC-V 指令集架構因其開源、模組化與可擴展性的特色而備受矚目,各種針對特定應用領域的擴充應運而生,而在現今講求高效能的時代下,SIMD 指令是提升 CPU 效能重要的一環。本論文圍繞在 SIMD 指令與 RISC-V 指令集架構上,旨在探討 LLVM 編譯器專門優化 SIMD 指令之 Loop Vectorization Pass 在 RISC-V 向量擴充 (RISC-V Vector Extension) 的循環優化表現,研究過程中比較了 RISC-V 與 ARM 在成本模型上的差異,辨識出特定程式碼下 RISC-V 損失的向量化機會,並透過實作自定義向量反轉指令 (vrev) 替換了 RISC-V 現有的指令組合,進而改善了現有 LLVM 在 RISC-V 上的優化限制,提供了更多向量化機會。
本研究闡述了在 LLVM 中設計並整合 vrev 指令的過程,包括修改指令選擇與程式碼生成的階段,使 LLVM 能在 RISC-V 架構中有更多優化循環程式碼的機會。為了評估實作的正確性與效能影響,我們也因此修改 QEMU 模擬器來模擬這個指令,將 QEMU 作為實驗平台,執行 LLVM 編譯後包含 vrev 指令的程式碼,並與未使用該指令的版本進行比較。最後使用 SPEC CPU 2006 以及 SPEC CPU 2017 基準測試套件來評估效能,結果顯示,透過 LLVM 支援自訂的 vrev 指令,在一些測試標準中,有了 2% 至 3% 的效能提升,說明了該自定義指令在 RISC-V 向量擴充中提升效能的潛力。 | zh_TW
dc.description.abstract | In recent years, the RISC-V Instruction Set Architecture (ISA) has gained significant attention due to its open-source nature, modularity, and extensibility, giving rise to various domain-specific instruction set extensions. In today's era of high-performance computing, SIMD instructions play a crucial role in enhancing CPU performance.
This thesis focuses on SIMD instructions and the RISC-V architecture, specifically examining how the Loop Vectorization Pass in the LLVM compiler, which is dedicated to SIMD optimization, performs on loops for the RISC-V Vector Extension (RVV). During our research, we compared the cost models of RISC-V and ARM and identified specific code patterns where RISC-V misses vectorization opportunities. To address this problem, we implemented a custom vector reverse instruction (vrev) that replaces the existing RISC-V instruction sequence. This change relaxes current optimization limitations in LLVM for RISC-V and provides more vectorization opportunities.
This research describes the process of designing and integrating the vrev instruction within LLVM, including modifications to the instruction selection and code generation phases, enabling LLVM to better optimize loop code for the RISC-V architecture. To evaluate the correctness and performance impact of our implementation, we modified the QEMU simulator to support this instruction and used it as an experimental platform to execute LLVM-compiled code containing the vrev instruction, comparing it with versions compiled without the instruction. Finally, performance evaluation using the SPEC CPU 2006 and SPEC CPU 2017 benchmark suites showed that supporting the custom vrev instruction natively in LLVM yields performance improvements of 2% to 3% in certain benchmarks, demonstrating the potential performance benefits of this custom instruction in the RISC-V Vector Extension. | en
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-07-11T16:13:27Z. No. of bitstreams: 0 | en
dc.description.provenance | Made available in DSpace on 2025-07-11T16:13:27Z (GMT). No. of bitstreams: 0 | en
dc.description.tableofcontents | Acknowledgements
Abstract (Chinese)
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Introduction and Motivation
1.2 Performance Gap Between RISC-V and ARM
1.3 Vector Reverse Pattern and Proposed Solution
1.4 Research Objectives and Contributions
Chapter 2 Background
2.1 LLVM
2.1.1 Loop Vectorization
2.1.2 VPlan
2.1.3 Cost Model
2.2 TableGen
2.3 RISC-V Vector Extension
2.4 ARM NEON and SVE
2.5 LLVM Backend Codegen Pipeline
2.6 QEMU
Chapter 3 Methodology
3.1 Vector Reverse SDNode in SelectionDAG
3.2 Designing the Vector Reverse Instruction in LLVM
3.2.1 Instruction Definition using TableGen
3.2.2 Custom SelectionDAG Node and Instruction Lowering Logic
3.2.3 Instruction Selection Pattern Matching
3.2.4 Assembly Printer
3.3 QEMU Implementation for Custom RISC-V Instructions
3.3.1 Implementation of vrev in QEMU
3.3.2 Instruction Decoding Definition
3.3.3 TCG Translation Logic
3.3.4 Helper Function Definitions
3.3.5 Helper Function Implementation
Chapter 4 Evaluation
4.1 LLVM Cost Model Analysis
4.2 Dynamic Instruction Count Analysis
4.3 Summary
Chapter 5 Conclusion
References
Appendix A: LLVM Backend
A.1 LLVM Backend Initialization and Legalization
A.2 Instruction Selection and Scheduling
A.3 LLVM Backend Code Generation
Appendix B: QEMU Simulation
B.1 QEMU Tiny Code Generator | -
dc.language.iso | en | -
dc.subject | 向量擴展指令集 | zh_TW
dc.subject | LLVM | zh_TW
dc.subject | 向量反轉指令 | zh_TW
dc.subject | RISC-V | zh_TW
dc.subject | QEMU | zh_TW
dc.subject | QEMU | en
dc.subject | LLVM | en
dc.subject | RISC-V | en
dc.subject | Vector Extension | en
dc.subject | Vector Reverse Instruction | en
dc.title | RISC-V 向量擴展指令集在 LLVM 編譯器中的自定義指令實現與最佳化 | zh_TW
dc.title | Custom Instruction Implementation and Optimization for RISC-V Vector Extension in LLVM Compiler | en
dc.type | Thesis | -
dc.date.schoolyear | 113-2 | -
dc.description.degree | Master's (碩士) | -
dc.contributor.oralexamcommittee | 黎士瑋;葉春超;洪鼎詠;沈柏曄 | zh_TW
dc.contributor.oralexamcommittee | Shih-Wei Li;Chun-Chao Yeh;Ding-Yong Hong;Bor-Yeh Shen | en
dc.subject.keyword | LLVM,RISC-V,向量擴展指令集,向量反轉指令,QEMU | zh_TW
dc.subject.keyword | LLVM,RISC-V,Vector Extension,Vector Reverse Instruction,QEMU | en
dc.relation.page | 40 | -
dc.identifier.doi | 10.6342/NTU202501420 | -
dc.rights.note | Not authorized (未授權) | -
dc.date.accepted | 2025-07-03 | -
dc.contributor.author-college | College of Electrical Engineering and Computer Science (電機資訊學院) | -
dc.contributor.author-dept | Graduate Institute of Networking and Multimedia (資訊網路與多媒體研究所) | -
dc.date.embargo-lift | N/A | -
Appears in collections: Graduate Institute of Networking and Multimedia
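
Illustrative note on the abstract: the record does not spell out the loop pattern or the exact semantics of vrev, so the following is only a minimal Python model under stated assumptions. It assumes the pattern of interest is a loop that reads an array back to front, and that vrev reverses the order of the active elements of one vector. The names `VL`, `vrev_model`, `reverse_copy_scalar`, and `reverse_copy_stripmined` are hypothetical and do not come from the thesis, LLVM, or QEMU. The sketch shows why a strip-mined (vectorized) version of such a loop needs an element reversal between its contiguous load and its contiguous store.

```python
from typing import List

VL = 4  # hypothetical number of elements held by one vector register


def reverse_copy_scalar(src: List[int]) -> List[int]:
    """Scalar reference loop: dst[i] = src[n - 1 - i]."""
    n = len(src)
    dst = [0] * n
    for i in range(n):
        dst[i] = src[n - 1 - i]
    return dst


def vrev_model(vec: List[int]) -> List[int]:
    """Assumed vrev semantics: out[i] = in[vl - 1 - i] over the active elements."""
    vl = len(vec)
    return [vec[vl - 1 - i] for i in range(vl)]


def reverse_copy_stripmined(src: List[int], vl_max: int = VL) -> List[int]:
    """Model of the vectorized loop: process up to vl_max elements per iteration."""
    n = len(src)
    dst = [0] * n
    i = 0
    while i < n:
        vl = min(vl_max, n - i)              # active vector length for this strip
        chunk = src[n - i - vl : n - i]      # contiguous "vector load" from the tail
        dst[i : i + vl] = vrev_model(chunk)  # reverse, then contiguous "vector store"
        i += vl
    return dst


if __name__ == "__main__":
    data = list(range(10))
    # The strip-mined model must agree with the scalar loop, including the short tail strip.
    assert reverse_copy_stripmined(data) == reverse_copy_scalar(data)
    print(reverse_copy_stripmined(data))
```

In this model the reversal is the only step that is not a plain contiguous memory access, which is why its cost in the target's cost model can decide whether such a loop is worth vectorizing.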

Files in this item:
File | Size | Format
ntu-113-2.pdf (not authorized for public access) | 6.14 MB | Adobe PDF