Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/28912
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 陳少傑(Sao-Jie Chen) | |
dc.contributor.author | Cheng-Cho Jean | en |
dc.contributor.author | 簡振哲 | zh_TW |
dc.date.accessioned | 2021-06-13T00:29:12Z | - |
dc.date.available | 2007-07-27 | |
dc.date.copyright | 2007-07-27 | |
dc.date.issued | 2007 | |
dc.date.submitted | 2007-07-24 | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/28912 | - |
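dc.description.abstract | Multimedia instruction sets are very common in general-purpose processors. They consist of a set of short vector instructions. To fully exploit the performance of a single-instruction, multiple-data (SIMD) architecture, we must generate highly efficient SIMD code, so that the architecture can deliver its full benefit. This thesis develops a SIMD compiler, with techniques that use short vector instruction sets effectively to generate efficient SIMD code automatically. Short-vector SIMD architectures do not support operations on non-contiguous memory addresses, so the data must be rearranged. The problems to be solved are how to rearrange non-contiguous memory addresses into contiguous ones and how to handle multiple data types efficiently. | zh_TW |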
dc.description.abstract | Multimedia extensions are nearly ubiquitous in today’s general-purpose processors. These extensions consist primarily of a set of short vector instructions that apply the same opcode to a vector of operands. Exploiting the SIMD capabilities of these architectures requires generating efficient simdized code.
This thesis develops a vector SIMD compiler with techniques that target short vector instructions effectively and automatically. Operations on non-contiguous vector elements are not supported and require explicit data realignment. Thus, one of the most common concerns in compilation for these targets is managing memory alignment effectively and coping with mixed data types. Simdizing such computations efficiently is therefore an ambitious challenge for compiler designers. We identify several new challenges that arise in simdizing multimedia applications and provide some solutions to them. We implemented an automatic simdization framework that supports effective simdization in the presence of control flow, memory misalignment, and mixed-length conversion. | en |
dc.description.provenance | Made available in DSpace on 2021-06-13T00:29:12Z (GMT). No. of bitstreams: 1 ntu-96-R94921126-1.pdf: 1152316 bytes, checksum: cc40133ae428cca168077eea8fcf94fb (MD5) Previous issue date: 2007 | en |
dc.description.tableofcontents | ABSTRACT; LIST OF FIGURES; LIST OF TABLES
CHAPTER 1 INTRODUCTION: 1.1 Motivation; 1.2 Objectives; 1.3 Organization of the Thesis
CHAPTER 2 BACKGROUND: 2.1 Overview of a Basic Compiler; 2.2 Lexical Analysis; 2.3 Syntax Analysis; 2.4 Semantic Analysis; 2.5 Intermediate Representation; 2.6 Machine-Independent and Machine-Dependent Optimizations; 2.7 Code Generation; 2.8 Overview of LCC Compiler Infrastructure; 2.9 Overview of GCC Compiler Infrastructure; 2.10 Overview of SUIF Compiler Infrastructure; 2.11 Overview of Machine SUIF Compiler Infrastructure; 2.12 Overview of Stream Shift Policy; 2.12.1 Stream Shift Policy: Zero-Shift; 2.12.2 Stream Shift Policy: Eager-Shift; 2.12.3 Stream Shift Policy: Lazy-Shift; 2.12.4 Stream Shift Policy: Dominant-Shift
CHAPTER 3 BACKGROUND ON PARALLELIZATION: 3.1 Vector Parallelization; 3.2 Loop Level Parallelization; 3.3 SIMD Parallelization; 3.4 Instruction Level Parallelization
CHAPTER 4 DEVELOPMENT OF A SIMD COMPILER: 4.1 PLX Architecture Overview; 4.2 Simdization Framework Overview; 4.2.1 Control Flow Conversion; 4.2.2 Loop Unrolling; 4.2.3 Dataflow Optimization; 4.2.4 Superword-Level Parallelism; 4.2.5 Basic Block Level Aggregation; 4.2.6 Short-Loop Aggregation; 4.2.7 Loop-Level Aggregation; 4.2.8 Alignment Devirtualization; 4.2.8.1 SIMD Hardware Constraints; 4.2.8.2 Data Reorganization Graph; 4.2.9 Length Devirtualization; 4.2.10 SIMD Code Generation; 4.3 Example of Mixed Sources of SIMD Parallelism; 4.4 Testing and Validation Methodologies; 4.4.1 Background; 4.4.2 Methodologies
CHAPTER 5 EXPERIMENTAL RESULTS AND DISCUSSION: 5.1 Benchmarks; 5.2 Evaluating Execution Cycles; 5.3 Performance Gain Evaluation
CHAPTER 6 CONCLUSION: 6.1 Future Work
REFERENCES | |
dc.language.iso | en | |
dc.title | 單一指令多重資料編譯器之開發 | zh_TW |
dc.title | Development of a SIMD Compiler | en |
dc.type | Thesis | |
dc.date.schoolyear | 95-2 | |
dc.description.degree | Master's | |
dc.contributor.oralexamcommittee | 蘇培陞,熊博安,王勝德 | |
dc.subject.keyword | single instruction multiple data, compiler, multimedia instruction set | zh_TW |
dc.subject.keyword | SIMD, compiler, SUIF | en |
dc.relation.page | 65 | |
dc.rights.note | Paid license | |
dc.date.accepted | 2007-07-26 | |
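dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW |
dc.contributor.author-dept | Graduate Institute of Electrical Engineering | zh_TW |
Appears in Collections: | Department of Electrical Engineering |
Files in This Item:
File | Size | Format | |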
---|---|---|---|
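ntu-96-1.pdf (not currently authorized for public access) | 1.13 MB | Adobe PDF |
Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.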