Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/28912
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 陳少傑 (Sao-Jie Chen)
dc.contributor.author | Cheng-Cho Jean | en
dc.contributor.author | 簡振哲 | zh_TW
dc.date.accessioned | 2021-06-13T00:29:12Z
dc.date.available | 2007-07-27
dc.date.copyright | 2007-07-27
dc.date.issued | 2007
dc.date.submitted | 2007-07-24
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/28912
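dc.description.abstract | Multimedia instruction sets are very common in general-purpose processors. They consist of a set of short vector instructions. To fully exploit the performance of a single-instruction, multiple-data (SIMD) architecture, we must generate efficient SIMD code so that the architecture can deliver its full benefit. This thesis develops techniques for a SIMD compiler that uses short vector instruction sets effectively to generate efficient SIMD code automatically. Short-vector SIMD architectures do not support operations on non-contiguous memory addresses, so the data must be rearranged. The problems to be solved are how to arrange non-contiguous memory addresses into contiguous ones and how to handle multiple data types efficiently. | zh_TW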
dc.description.abstract | Multimedia extensions are nearly ubiquitous in today's general-purpose processors. These extensions consist primarily of a set of short vector instructions that apply the same opcode to a vector of operands. To exploit the SIMD capabilities of these architectures, a compiler must generate efficient simdized code.
This thesis sets out to develop a vector SIMD compiler with techniques that target short vector instructions effectively and automatically. Operations on non-contiguous vector elements are not supported, so they require explicit data realignment. A central concern in compiling for these targets is therefore the effective management of memory alignment and of mixed data types.
We identify several new challenges that arise in simdizing multimedia applications and provide solutions to some of them. Simdizing such computations efficiently is an ambitious challenge for compiler designers. We implemented an automatic simdization framework that supports effective simdization in the presence of control flow, memory misalignment, and mixed-length conversion. | en
dc.description.provenance | Made available in DSpace on 2021-06-13T00:29:12Z (GMT). No. of bitstreams: 1
ntu-96-R94921126-1.pdf: 1152316 bytes, checksum: cc40133ae428cca168077eea8fcf94fb (MD5)
Previous issue date: 2007 | en
dc.description.tableofcontents | ABSTRACT
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1 INTRODUCTION
1.1 Motivation
1.2 Objectives
1.3 Organization of the Thesis
CHAPTER 2 BACKGROUND
2.1 Overview of a Basic Compiler
2.2 Lexical Analysis
2.3 Syntax Analysis
2.4 Semantic Analysis
2.5 Intermediate Representation
2.6 Machine-Independent and Machine-Dependent Optimizations
2.7 Code Generation
2.8 Overview of LCC Compiler Infrastructure
2.9 Overview of GCC Compiler Infrastructure
2.10 Overview of SUIF Compiler Infrastructure
2.11 Overview of Machine SUIF Compiler Infrastructure
2.12 Overview of Stream Shift Policy
2.12.1 Stream Shift Policy: Zero-Shift
2.12.2 Stream Shift Policy: Eager-Shift
2.12.3 Stream Shift Policy: Lazy-Shift
2.12.4 Stream Shift Policy: Dominant-Shift
CHAPTER 3 BACKGROUND ON PARALLELIZATION
3.1 Vector Parallelization
3.2 Loop Level Parallelization
3.3 SIMD Parallelization
3.4 Instruction Level Parallelization
CHAPTER 4 DEVELOPMENT OF A SIMD COMPILER
4.1 PLX Architecture Overview
4.2 Simdization Framework Overview
4.2.1 Control Flow Conversion
4.2.2 Loop Unrolling
4.2.3 Dataflow Optimization
4.2.4 Superword-Level Parallelism
4.2.5 Basic Block Level Aggregation
4.2.6 Short-Loop Aggregation
4.2.7 Loop-Level Aggregation
4.2.8 Alignment Devirtualization
4.2.8.1 SIMD Hardware Constraints
4.2.8.2 Data Reorganization Graph
4.2.9 Length Devirtualization
4.2.10 SIMD Code Generation
4.3 Example of Mixed Sources of SIMD Parallelism
4.4 Testing and Validation Methodologies
4.4.1 Background
4.4.2 Methodologies
CHAPTER 5 EXPERIMENTAL RESULTS AND DISCUSSION
5.1 Benchmarks
5.2 Evaluating Execution Cycles
5.3 Performance Gain Evaluation
CHAPTER 6 CONCLUSION
6.1 Future Work
REFERENCES
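dc.language.iso | en
dc.subject | multimedia instruction set | zh_TW
dc.subject | compiler | zh_TW
dc.subject | single instruction, multiple data | zh_TW
dc.subject | SUIF | en
dc.subject | SIMD | en
dc.subject | compiler | en
dc.title | 單一指令多重資料編譯器之開發 | zh_TW
dc.title | Development of a SIMD Compiler | en
dc.type | Thesis
dc.date.schoolyear | 95-2
dc.description.degree | Master's
dc.contributor.oralexamcommittee | 蘇培陞, 熊博安, 王勝德
dc.subject.keyword | single instruction multiple data, compiler, multimedia instruction set | zh_TW
dc.subject.keyword | SIMD, compiler, SUIF | en
dc.relation.page | 65
dc.rights.note | Paid license
dc.date.accepted | 2007-07-26
dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW
dc.contributor.author-dept | Graduate Institute of Electrical Engineering | zh_TW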
Appears in collection: Department of Electrical Engineering

Files in this item:
File | Size | Format
ntu-96-1.pdf (not authorized for public access) | 1.13 MB | Adobe PDF


Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.
