Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/4410
Full metadata record
DC field: value (language)
dc.contributor.advisor: 劉邦鋒 (Pangfeng Liu)
dc.contributor.author: Chun-Chen Hsu (en)
dc.contributor.author: 許俊琛 (zh_TW)
dc.date.accessioned: 2021-05-14T17:42:06Z
dc.date.available: 2015-08-25
dc.date.available: 2021-05-14T17:42:06Z
dc.date.copyright: 2015-08-25
dc.date.issued: 2015
dc.date.submitted: 2015-08-19
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/4410
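dc.description.abstract: Dynamic binary translation is a core virtualization technology because it can be used to accelerate instruction set emulation. For a dynamic binary translator, the key factors are the quality of the translated binary code and whether program regions that can be further optimized are found during execution. In this dissertation, we present a retargetable binary translation system that produces high-performance dynamic binary translators. We also propose two hot-region detection methods to improve the performance of dynamic binary translators.
The translated code of a dynamic binary translator has a major impact on performance, so it is usually hand-optimized. However, a hand-optimized translator is hard to retarget to another system, because porting it to a new platform requires the same implementation effort again. This dissertation first proposes an easily retargetable binary translation system, called LLVM+QEMU (LnQ), which uses existing compiler technology to produce high-performance dynamic binary translators. Compared with QEMU, LnQ achieves more than 2x the performance on SPEC CINT2006 for the ARM-to-x86_64 and x86-to-ARM dynamic binary translators.
Whether the hot regions of a program can be detected during execution is another key factor in performance. In this dissertation we point out a weakness of current hot-region detection methods: the detected hot regions may not execute to completion, and performance suffers when there are many such regions. We first propose a way to measure this weakness to show that it really exists, and then propose a method to remedy it. We propose a lightweight hot-region detection technique called "Early-Exit Guided Region Formation (EEG)". EEG continually finds hot regions that cannot execute to completion and merges them with other hot regions to improve performance. For the ARM-to-x86_64 dynamic binary translator, this method improves on the previous method by about 23%; for the x86_64-to-ARM translator, by about 11%. Finally, we propose another hot-region detection method that uses procedures as its unit, and compare its strengths and weaknesses with those of EEG.
(zh_TW)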
dc.description.abstract: Dynamic binary translation is one of the core technologies in virtualization, used to boost the performance of instruction set architecture (ISA) simulation. The key factors in the performance of a dynamic binary translator are the quality of the translated code and the ability to detect hot regions at run time. This dissertation builds a retargetable dynamic binary translator framework and provides two hot-region detection approaches to improve the performance of dynamic binary translators.
The quality of translated code is critical to the performance of a dynamic binary translator, which implements the semantics of guest ISA instructions with host ISA instructions, so the translated code is often carefully hand-optimized. However, a hand-optimized translator is not retargetable, because porting it to a new host ISA takes tremendous implementation effort. This dissertation first proposes an LLVM+QEMU (LnQ) framework for building high-performance, retargetable binary translators from existing compiler modules. The goal of the LnQ framework is to build such translators with existing industry-strength compiler optimization passes and code generation backends. Compared to QEMU, LnQ shows more than 2X speedup on CINT2006 for the ARM-to-x86_64 and x86-to-ARM dynamic binary translators.
Besides the quality of translated code, the ability to detect hot regions of guest applications also determines the performance of dynamic binary translators. Most dynamic binary translators target traces, i.e. frequently executed code paths, as the code regions to be translated and optimized. The Next-Executing-Tail (NET) trace formation method is an important example of such techniques, and many existing trace formation schemes are variants of NET.
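To make the idea of NET-style trace formation concrete, the following is a minimal Python sketch and not the dissertation's implementation. It assumes a simplified model in which execution is a stream of basic-block addresses, a backward branch is detected when the next address is not greater than the previous one, and only backward-branch targets become trace heads (real NET also counts exits from existing traces). The names `net_traces`, `HOT_THRESHOLD`, and `MAX_TRACE_LEN` and their values are hypothetical.

```python
from collections import defaultdict

HOT_THRESHOLD = 3   # hypothetical hotness threshold; real systems tune this
MAX_TRACE_LEN = 8   # hypothetical cap on the number of blocks per trace


def net_traces(block_stream, threshold=HOT_THRESHOLD, max_len=MAX_TRACE_LEN):
    """Form NET-style traces from a stream of executed basic-block addresses.

    A block is a candidate trace head when it is reached by a backward branch
    (its address is not greater than the previous block's address). Once a
    head's counter reaches the threshold, the blocks executed next are
    recorded as its trace until another backward branch or the length cap.
    """
    counters = defaultdict(int)
    traces = {}        # head address -> list of block addresses
    recording = None   # trace currently being recorded, if any
    prev = None
    for addr in block_stream:
        backward = prev is not None and addr <= prev
        if recording is not None:
            if backward or len(recording) >= max_len:
                traces[recording[0]] = recording   # close the trace
                recording = None
            else:
                recording.append(addr)             # extend the executing tail
        if recording is None and backward and addr not in traces:
            counters[addr] += 1
            if counters[addr] >= threshold:
                recording = [addr]                 # head became hot
        prev = addr
    return traces
```

For example, calling `net_traces([10, 20, 30] * 5)` on a three-block loop records a single trace headed at block 10. Because the trace is simply whatever path executed right after the head became hot, nothing guarantees that later executions follow the same path.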
This dissertation examines the inefficiency of NET-like trace formation algorithms. We found that the formed traces may contain a large number of early exits through which execution can branch out of a trace. If this happens frequently, the program spends more time in the slow binary interpreter or in unoptimized code regions than in the optimized traces in the code cache, and the benefit of trace optimization is lost. Traces with frequently taken early exits are called delinquent traces.
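One way to picture how delinquent traces could be flagged from profile counters is sketched below. This is an illustration under stated assumptions: the dissertation defines its own Early Exit Index, which this record does not reproduce, so the ratio used here (early exits divided by trace entries), the 0.5 threshold, and the names `TraceProfile`, `early_exit_ratio`, and `delinquent_traces` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TraceProfile:
    head: int          # address of the trace's first block
    entries: int       # times execution entered the trace
    early_exits: int   # times execution left before the trace's last block


def early_exit_ratio(profile):
    """Fraction of trace entries that left through an early exit (assumed metric)."""
    if profile.entries == 0:
        return 0.0
    return profile.early_exits / profile.entries


def delinquent_traces(profiles, threshold=0.5):
    """Return the heads of traces whose early-exit ratio exceeds the threshold."""
    return [p.head for p in profiles if early_exit_ratio(p) > threshold]
```

With this assumed metric, a trace entered 100 times that leaves early 80 times would be flagged, while one that leaves early 5 times would not.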
This dissertation proposes a lightweight region formation technique called Early-Exit Guided Region Formation (EEG) to improve the efficiency of traces. It iteratively identifies delinquent regions and merges them into larger code regions. EEG is shown to achieve 1.23X and 1.11X speedups on CINT2006 for the ARM-to-x86_64 and x86-to-ARM DBTs, respectively, compared to NET.
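The iterative identify-and-merge step can be sketched as follows. This is a rough illustration only, not the EEG algorithm as specified in the dissertation: it assumes that a region is delinquent when its early exits exceed a hypothetical fraction of its entries, and that a delinquent region is merged with the region its most frequent early exit leads to. The function name `eeg_merge`, its data layout, and the 0.5 threshold are assumptions.

```python
def eeg_merge(regions, entries, early_exits, threshold=0.5):
    """Iteratively merge delinquent regions with their hottest early-exit target.

    regions:     {head: set of block addresses}
    entries:     {head: number of times the region was entered}
    early_exits: {head: {target_head: number of early exits to that region}}
    Returns a new {head: set of block addresses} mapping.
    """
    regions = {h: set(b) for h, b in regions.items()}
    entries = dict(entries)
    exits = {h: dict(t) for h, t in early_exits.items()}
    while True:
        merged = False
        for head in list(regions):
            targets = exits.get(head, {})
            total = sum(targets.values())
            if not entries.get(head) or total / entries[head] <= threshold:
                continue                      # not delinquent
            target = max(targets, key=targets.get)
            if target == head or target not in regions:
                continue                      # nothing to merge with
            # Absorb the target region; exits between the two become internal.
            regions[head] |= regions.pop(target)
            targets.pop(target)
            for t, n in exits.pop(target, {}).items():
                if t != head:
                    targets[t] = targets.get(t, 0) + n
            exits[head] = targets
            entries.pop(target, None)
            for other in exits.values():      # redirect exits aimed at target
                if target in other:
                    other[head] = other.get(head, 0) + other.pop(target)
            merged = True
            break                             # re-scan with updated regions
        if not merged:
            return regions
```

Each merge removes one region, so the loop terminates. A real translator would re-profile the merged region at run time rather than reuse the old counters as this sketch does.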
This dissertation also studies a procedure-based dynamic binary translator that detects hot procedures as its compilation (i.e. translation and optimization) unit. We compare the performance of our EEG region formation algorithm with that of procedure-based regions.
(en)
dc.description.provenance: Made available in DSpace on 2021-05-14T17:42:06Z (GMT). No. of bitstreams: 1
ntu-104-D95922006-1.pdf: 5710028 bytes, checksum: 8c36c6c0b0dac246c1797b4e8cdd2cb9 (MD5)
Previous issue date: 2015
(en)
dc.description.tableofcontents:
Oral Examination Committee Certification
Acknowledgements
Acknowledgements (in Chinese)
Abstract
Abstract (in Chinese)
Contents
List of Figures
List of Tables
1 Introduction
1.1 Retargetability of Dynamic Binary Translators
1.2 High Performance of Dynamic Binary Translators
1.2.1 Delinquent Trace
1.2.2 Solution 1: Early-Exit Guided Region Formation
1.2.3 Solution 2: Trace-Guided Procedure-Based Region Formation
1.3 Contributions
1.4 Dissertation Organization
2 The LLVM+QEMU (LnQ) Framework
2.1 LnQ: Design and Implementation
2.1.1 LLVM Intermediate Representation
2.1.2 IR Library and Instruction Description Table
2.1.3 LLVM IR Translator
2.1.3.1 Register Mapping
2.1.4 Emulation Module
2.2 Runtime Optimizations
2.2.1 Block Linking
2.2.2 Indirect Branch Target Caching
2.2.3 Shadow Stack
2.3 Performance Evaluation
2.3.1 Experiment Settings
2.3.2 Performance of LnQ
2.3.3 Performance of LLVM Just-In-Time Compiler
2.3.3.1 Execution Time Spent in Code Cache
2.3.3.2 Translation Overhead
2.3.4 Optimization Effects of Runtime Optimization
2.3.5 Slowdown of LnQ Compared to Native Run
2.3.6 Performance of ARM-to-IA32 LnQ
2.3.7 Performance of IA32-to-ARM LnQ
2.4 Concluding Remarks
3 The Early-Exit Guided Code Region Formation
3.1 Region-Based Multi-threaded Dynamic Binary Translator
3.2 Early Exit Index and Early-Exit Guided Region Formation
3.2.1 Trace Formation Algorithm
3.2.2 Early Exit Index
3.2.3 Early-Exit Guided Region Formation
3.2.4 Spill Index of a Region
3.2.5 Region Versus Trace
3.3 Performance Evaluation
3.3.1 Performance Results of SPEC CPU2006
3.3.1.1 Performance of NET*
3.3.1.2 Performance of EEG Region Formation
3.3.2 Early Exit Index
3.3.3 Performance Profiles of EEG
3.3.4 Effect of The Threshold of Spill Index
3.3.5 Statistics of Selected Traces and Regions
3.3.6 Performance Comparison to Native Execution
3.3.7 Performance of EEG on IA32-to-ARM LnQ
3.4 Concluding Remarks
4 Trace-Guided Procedure-Based Code Region Formation
4.1 Architecture of Dynamic Binary Translator
4.2 Procedure Compilation in Dynamic Binary Translation
4.2.1 Call-Return Problem
4.2.2 Code/Data Distinction
4.2.3 Targets of Indirect Branch
4.3 Performance Evaluation
4.3.1 Experimental Environment and Settings
4.3.2 Experimental Results
4.3.2.1 Overview of the Performance
4.3.2.2 Detailed Performance of Procedure-Based DBT
4.4 Concluding Remarks
5 Related Works
5.1 Related Works of Dynamic Binary Translation (DBT)
5.1.1 QEMU
5.1.2 Retargetable Dynamic Binary Translators
5.1.3 Other Dynamic Binary Translators
5.2 Related Works of Region Formation
5.2.1 NET
5.2.2 Most Recently Executed Tail2
5.2.3 NETPlus
5.2.4 Last-Executed Iteration (LEI)
5.3 Related Works of Language Virtual Machines
5.3.1 Method-Based Language Virtual Machines
5.3.2 Trace-Based Language Virtual Machines
6 Conclusion and Future Works
6.1 Conclusion
6.2 Future Research Direction
Bibliography
dc.language.iso: en
dc.subject: binary translation (zh_TW)
dc.subject: virtualization (zh_TW)
dc.subject: region formation (en)
dc.subject: binary translator (en)
dc.subject: QEMU (en)
dc.subject: just-in-time compilation (en)
dc.title: 可重定目標且高效能之動態二元碼轉譯器框架系統 (zh_TW)
dc.title: Retargetable and Efficient Dynamic Binary Translation Framework (en)
dc.type: Thesis
dc.date.schoolyear: 103-2
dc.description.degree: Doctoral
dc.contributor.coadvisor: 吳真貞 (Jan-Jan Wu)
dc.contributor.oralexamcommittee: 郭大維 (Tei-Wei Kuo), 楊佳玲 (Chia-Lin Yang), 徐慰中 (Wei-Chung Hsu), 羅習五 (Shi-Wu Lo)
dc.subject.keyword: virtualization, binary translation (zh_TW)
dc.subject.keyword: binary translator, region formation, just-in-time compilation, QEMU (en)
dc.relation.page: 96
dc.rights.note: Authorization granted (open access worldwide)
dc.date.accepted: 2015-08-19
dc.contributor.author-college: College of Electrical Engineering and Computer Science (zh_TW)
dc.contributor.author-dept: Graduate Institute of Computer Science and Information Engineering (zh_TW)
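Appears in collections: Department of Computer Science and Information Engineering

Files in this item:
File: ntu-104-1.pdf, Size: 5.58 MB, Format: Adobe PDF

Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.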
