Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/60212

Full metadata record:
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 洪士灝(Shih-Hao Hung) | |
| dc.contributor.author | Kuan-Wen Su | en |
| dc.contributor.author | 蘇冠文 | zh_TW |
| dc.date.accessioned | 2021-06-16T10:13:43Z | - |
| dc.date.available | 2018-08-23 | |
| dc.date.copyright | 2013-08-23 | |
| dc.date.issued | 2013 | |
| dc.date.submitted | 2013-08-19 | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/60212 | - |
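| dc.description.abstract | In a heterogeneous system architecture, modern computer systems combine multi-core central processing units (CPUs) with multiple hardware accelerators to run applications more efficiently. As heterogeneous system architectures rapidly become mainstream, various kinds of processors, such as graphics processing units (GPUs), field-programmable gate arrays (FPGAs), and digital signal processors (DSPs), are widely used for application-specific computation to improve overall system performance. The Open Computing Language (OpenCL) provides an open standard framework that lets developers use these different kinds of processors as compute devices. To aid the design of heterogeneous system architectures and the analysis of OpenCL applications, we developed a heterogeneous virtual platform. In addition, because the operating system handles task scheduling, file system management, I/O operations, and networking, we believe a full-system simulation environment is also important when running data-intensive OpenCL applications. This thesis presents a heterogeneous virtual platform that executes OpenCL applications and simulates multi-core CPUs and GPUs. We also developed PSET, a parallel tracing and performance profiling tool for the virtual platform, which lets OpenCL application developers analyze and optimize their programs through the graphical analysis interface the platform provides. Finally, we demonstrate the platform's usefulness by analyzing OpenCL applications. Our heterogeneous virtual platform enables OpenCL application developers and system architects to perform system-wide analysis and optimization. | zh_TW |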
| dc.description.abstract | In a heterogeneous architecture, a modern computer system combines multi-core CPUs with multiple accelerator cores to run applications efficiently. As heterogeneous system architectures rapidly become mainstream, various types of processors, such as GPUs, FPGAs, and DSPs, can be chosen to optimize a system for specific applications. OpenCL provides a standard framework for using different types of processors as computing devices. To aid the design of heterogeneous systems and the evaluation of OpenCL applications, we developed a heterogeneous virtual platform. Since the operating system is responsible for task scheduling, file systems, I/O operations, and networking, we believe full-system emulation is essential for evaluating data-intensive OpenCL applications. In this thesis, we describe a heterogeneous virtual platform with multi-core CPUs and GPUs for executing OpenCL applications. Furthermore, we developed an event-driven performance analysis toolkit, PSET, for profiling OpenCL applications on the virtual platform. We demonstrate how our heterogeneous virtual platform works by analyzing OpenCL applications with the help of graphical visualization, and show that our framework is a useful tool for system-wide optimization by developers and architects. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-16T10:13:43Z (GMT). No. of bitstreams: 1 ntu-102-R00944032-1.pdf: 2291653 bytes, checksum: d66ff4351b77c116d4bf0ccf5b010cfd (MD5) Previous issue date: 2013 | en |
| dc.description.tableofcontents | Acknowledgments (p. i); Chinese Abstract (中文摘要) (p. ii); Abstract (p. iii); 1 Introduction (p. 1); 1.1 Motivations (p. 1); 1.2 Thesis Organization (p. 3); 2 Background and Related Works (p. 4); 2.1 OpenCL (p. 4); 2.1.1 OpenCL Platform Model (p. 5); 2.1.2 OpenCL Memory Model (p. 6); 2.1.3 OpenCL Execution Model (p. 8); 2.1.4 OpenCL Programming Model (p. 8); 2.2 Multi2Sim (p. 9); 2.3 Related Works (p. 10); 3 Framework and Implementation (p. 12); 3.1 Functional Modeling (p. 14); 3.2 Performance Modeling (p. 17); 3.3 PSET Framework (p. 19); 4 Experiment (p. 24); 4.1 Experimental Setup (p. 24); 4.2 Evaluation of OpenCL Applications (p. 25); 4.2.1 Evaluation of a Median Filter Application (p. 25); 4.2.2 Evaluation of the Rodinia Benchmarks (p. 28); 5 Conclusion and Future Work (p. 33); 5.1 Conclusion (p. 33); 5.2 Future Work (p. 34); Bibliography (p. 35) | |
| dc.language.iso | en | |
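| dc.subject | Heterogeneous Virtual Platform (異質性虛擬平台) | zh_TW |
| dc.subject | Performance Analysis (效能分析) | zh_TW |
| dc.subject | Multi-core Platform (多核心平台) | zh_TW |
| dc.subject | Parallel Processing (平行化處理) | zh_TW |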
| dc.subject | OpenCL | zh_TW |
| dc.subject | Multi-core Platform | en |
| dc.subject | Parallel Processing | en |
| dc.subject | OpenCL | en |
| dc.subject | Performance Analysis | en |
| dc.subject | Heterogeneous Virtual Platform | en |
| dc.title | 在虛擬異質性平台上做OpenCL應用程式評估 | zh_TW |
| dc.title | Evaluating OpenCL Applications with Heterogeneous Virtual Platforms | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 101-2 | |
| dc.description.degree | Master's (碩士) | |
| dc.contributor.oralexamcommittee | 施吉昇(Chi-Sheng Shih),蘇文鈺(Wen-Yu Su),劉邦鋒(Pangfeng Liu),楊佳玲(Chia-Lin Yang) | |
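| dc.subject.keyword | Heterogeneous Virtual Platform (異質性虛擬平台), Performance Analysis (效能分析), Multi-core Platform (多核心平台), Parallel Processing (平行化處理), OpenCL | zh_TW |
| dc.subject.keyword | Heterogeneous Virtual Platform, Performance Analysis, Multi-core Platform, Parallel Processing, OpenCL | en |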
| dc.relation.page | 37 | |
| dc.rights.note | Paid license (有償授權) | |
| dc.date.accepted | 2013-08-20 | |
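| dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW |
| dc.contributor.author-dept | Graduate Institute of Networking and Multimedia | zh_TW |
| Appears in collections: | Graduate Institute of Networking and Multimedia | |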
Files in this item:
| File | Size | Format | |
|---|---|---|---|
| ntu-102-1.pdf (not authorized for public access) | 2.24 MB | Adobe PDF |
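All items in the system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.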
