Synergistic Computing Based on the CPU and GPU (基於中央處理器及圖形處理單元之協同運算)

Shao-Yun Kuang; 鄺劭昀

Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/52955

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	廖世偉
dc.contributor.author	Shao-Yun Kuang	en
dc.contributor.author	鄺劭昀	zh_TW
dc.date.accessioned	2021-06-15T16:35:55Z	-
dc.date.available	2017-08-31
dc.date.copyright	2015-08-31
dc.date.issued	2015
dc.date.submitted	2015-08-12
dc.identifier.citation	[1] Jonathan Ragan-Kelley, Andrew Adams, Sylvain Paris, Marc Levoy, Saman Amarasinghe, Fredo Durand. Decoupling algorithms from schedules for easy optimization of image processing pipelines. [2] Jonathan Ragan-Kelley, Connelly Barnes, Andrew Adams, Sylvain Paris, Fredo Durand, Saman Amarasinghe. Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation. [3] Prasanna Pandit, R. Govindarajan. Fluidic kernels: Cooperative execution of OpenCL programs on multiple heterogeneous devices. In Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization. [4] Khronos Group. OpenCL - the open standard for parallel programming of heterogeneous systems. 2013. [5] C. Tomasi and R. Manduchi. Bilateral filtering for gray and color images. In Proceedings of the IEEE International Conference on Computer. [6] Sylvain Paris, Samuel W. Hasinoff, Jan Kautz. Local Laplacian filters: Edge-aware image processing with a Laplacian pyramid. 2015. [7] Nvidia. New top500 list: 4x more GPU supercomputers. 2012. [8] L. Nyland, M. Harris, J. Prins. Fast n-body simulation with CUDA. 2012. [9] V. W. Lee, C. Kim, J. Chhugani, M. Deisher, D. Kim, A. D. Nguyen, N. Satish, M. Smelyanskiy, S. Chennupaty, P. Hammarlund, R. Singhal, and P. Dubey. Debunking the 100x GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU. 2012. [10] Jonathan Ragan-Kelley. Decoupling Algorithms from the Organization of Computation for High Performance Image Processing: The design and implementation of the Halide language and compiler. PhD thesis, MIT.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/52955	-
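dc.description.abstract	In this thesis, we build a synergistic computing framework as an add-on to Halide, using Halide's strengths to create a better heterogeneous computing environment. The framework provides two workload-distribution methods, which offer different degrees of speedup. We measure the speedup each method provides, as well as the speedup at different input sizes. In our experiments, using our framework with optimized Halide CPU and GPU programs yields speedups of 1.21x, 1.55x, and 1.16x on the Nexus 7, an Intel CPU with an ATI GPU, and an Intel CPU with an Nvidia GPU, respectively.	zh_TW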
dc.description.abstract	In this thesis, we create a synergistic framework as a Halide extension to build a better heterogeneous computing environment that takes advantage of Halide. Our framework supports two methods, dynamic dispatch and static dispatch, for assigning different workloads to the CPU and GPU; the two methods yield different levels of speedup. We measure both dispatch methods with different input sizes on a mobile device, an ATI GPU, and an NVIDIA GPU. In our experiments, using our framework with optimized Halide CPU and GPU functions achieves a 1.21x speedup on the Nexus 7, a 1.55x speedup on an Intel CPU with an ATI GPU, and a 1.16x speedup on an Intel CPU with an NVIDIA GPU.	en
dc.description.provenance	Made available in DSpace on 2021-06-15T16:35:55Z (GMT). No. of bitstreams: 1 ntu-104-R02922099-1.pdf: 776550 bytes, checksum: 66e9d1bc46a613a73d52d3e5a3073a9a (MD5) Previous issue date: 2015	en
dc.description.tableofcontents	Contents: Acknowledgements; Abstract; Abstract (Chinese); 1 Motivation; 2 Background and Related Work; 2.1 Introduction of Halide; 2.2 Introduction of OpenCL; 2.3 Fluidic Kernels; 3 Implementation; 3.1 Overview; 3.2 Buffer Setup; 3.3 Static Dispatch; 3.4 Dynamic Dispatch; 3.5 Usage of Our Framework; 4 Experiments; 4.1 Static Dispatch; 4.2 Dynamic Dispatch; 4.3 Profile OpenCL Runtime Information; 4.4 CUDA and OpenCL; 4.5 Different Input Size; 4.6 OpenCL API Execution Time with Larger Size; 5 Conclusion; Bibliography
dc.language.iso	en
dc.title	Synergistic Computing Based on the CPU and GPU (基於中央處理器及圖形處理單元之協同運算)	zh_TW
dc.title	A Synergistic Framework base on Halide	en
dc.type	Thesis
dc.date.schoolyear	103-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	徐慰中,梁伯嵩
dc.subject.keyword	synergistic computing, central processing unit, graphics processing unit, heterogeneous computing	zh_TW
dc.subject.keyword	synergistic computing, CPU, GPU, heterogeneous computing	en
dc.relation.page	30
dc.rights.note	Paid license
dc.date.accepted	2015-08-12
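dc.contributor.author-college	College of Electrical Engineering and Computer Science	zh_TW
dc.contributor.author-dept	Graduate Institute of Computer Science and Information Engineering	zh_TW
Appears in Collections:	Department of Computer Science and Information Engineering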

Files in this item:

File	Size	Format
ntu-104-1.pdf (not currently authorized for public access)	758.35 kB	Adobe PDF


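All items in the system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.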