Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/725
Title: 基於進階HQEMU之動態二進制碼向量化
Dynamic Binary Vectorization in Enhanced HQEMU
Authors: Chih-Min Lin
林致民
Advisor: 徐慰中
Keyword: Dynamic Binary Translation, Virtual Register Promotion, SIMD, Vector, Auto Vectorization
Publication Year: 2019
Degree: Master's
Abstract: Compilers have used auto-vectorization for decades to exploit data-level parallelism. However, because processor architectures keep adding features that improve vector/SIMD performance, legacy application binaries cannot fully exploit the vector/SIMD capabilities of modern architectures. For example, legacy ARMv7 binaries cannot benefit from the double-precision SIMD capability of ARMv8, and legacy x86 binaries cannot take advantage of the AVX-512 extensions.
In this thesis, we study the fundamental issues involved in using cross-ISA Dynamic Binary Translation (DBT) to convert non-vectorized loops into vector/SIMD form, so that applications can reach the higher computation throughput available on newer processor architectures. The key idea is to recover critical loop information from the application binaries so that the loops can be vectorized at runtime. Experimental results show that, in an ARMv7-to-ARMv8 dynamic binary translation system, our approach achieves an average speedup of 1.42x over native ARMv7 execution across various benchmarks.
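To make the abstract's idea of converting a non-vectorized loop to vector/SIMD form concrete, here is a minimal source-level sketch in C. It is not the thesis's implementation, which works on translated binary code inside HQEMU. The function names, the `VLEN` constant, and the multiply-add kernel are hypothetical and chosen only for illustration. The sketch assumes a 128-bit vector register holding two doubles.

```c
#include <stddef.h>

/* Scalar form: one double-precision multiply-add per iteration.
   This is the shape a loop in a non-vectorized legacy binary would have. */
void daxpy_scalar(size_t n, double a, const double *x, double *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* Assumption: two doubles fit in one 128-bit SIMD register. */
#define VLEN 2

/* Vector-shaped form: the main loop handles VLEN elements per iteration
   and a scalar tail loop handles the remaining n % VLEN elements.
   The inner "lane" loop stands in for a single SIMD instruction. */
void daxpy_vector_shape(size_t n, double a, const double *x, double *y)
{
    size_t i = 0;
    for (; i + VLEN <= n; i += VLEN) {
        for (size_t lane = 0; lane < VLEN; lane++)
            y[i + lane] = a * x[i + lane] + y[i + lane];
    }
    for (; i < n; i++)  /* scalar remainder */
        y[i] = a * x[i] + y[i];
}
```

Producing the second form from the first requires knowing the induction variable, the trip count, and the memory access pattern, and establishing that no iteration depends on an earlier one. This is the kind of loop information that is readily available in source code but has to be recovered when only the binary is available.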
URI: http://tdr.lib.ntu.edu.tw/handle/123456789/725
DOI: 10.6342/NTU201902129
Fulltext Rights: Authorization granted (open access worldwide)
Appears in Collections: 資訊網路與多媒體研究所 (Graduate Institute of Networking and Multimedia)

Files in This Item:
File            Size        Format
ntu-108-1.pdf   858.07 kB   Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
