Acceleration of the Finite-Difference Time-Domain Method Using Graphics Processing Unit

Jyun-Lin Cian; 錢俊霖

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/46583

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	邱奕鵬
dc.contributor.author	Jyun-Lin Cian	en
dc.contributor.author	錢俊霖	zh_TW
dc.date.accessioned	2021-06-15T05:16:58Z	-
dc.date.available	2010-07-26
dc.date.copyright	2010-07-26
dc.date.issued	2010
dc.date.submitted	2010-07-21
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/46583	-
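dc.description.abstract	The finite-difference time-domain (FDTD) method is a widely used numerical method, but its heavy computational load often leads to long run times. Several hardware acceleration approaches have been studied in recent years; one of them, used in this thesis, is the graphics processing unit (GPU). Compared with the other hardware that has been studied, GPUs are easier to obtain, cheaper, and computationally powerful, so we consider them well suited to accelerating FDTD. The GPU used in this thesis is an NVIDIA GTX 260; the CPU used for comparison is an Intel Core i7-930. We implement four cases, combining two and three dimensions with single- and double-precision floating point, apply various optimizations, and finally build a system in which two graphics cards compute in parallel. In the performance tests, taking single precision as an example, the GPU reaches 23.5 times the performance of the CPU in the 2D case and 35.9 times in the 3D case. In the dual-card system, the computational efficiency reaches up to 1.95 times that of a single card.	en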
dc.description.provenance	Made available in DSpace on 2021-06-15T05:16:58Z (GMT). No. of bitstreams: 1 ntu-99-R97941097-1.pdf: 2990974 bytes, checksum: 4ae5ed0f222d55a1f7be3e1b9a352e9c (MD5) Previous issue date: 2010	en
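dc.description.tableofcontents	Oral Examination Committee Certification; Acknowledgements; Chinese Abstract; English Abstract; Contents; List of Figures; List of Tables; Chapter 1 Introduction; 1.1. Motivation; 1.2. Introduction to GPGPU; 1.3. Literature Review; 1.4. Chapter Overview; Chapter 2 Introduction to the Compute Unified Device Architecture (CUDA); 2.1. GPU Hardware Architecture; 2.1.1. Development of GPU Hardware Architecture; 2.1.2. CUDA-Capable GPU Hardware Architecture; 2.2. CUDA Programming Model; 2.3. CUDA Thread Model; 2.3.1. Thread Structure; 2.3.2. Thread Execution and Scheduling; 2.4. CUDA Memory Model; Chapter 3 Numerical Methods; 3.1. Finite-Difference Time-Domain (FDTD) Method; 3.1.1. Maxwell's Equations; 3.1.2. Yee's Algorithm; 3.1.3. Derivation of the 3D FDTD Equations; 3.1.4. FDTD Stability Condition; 3.2. Absorbing Boundary Conditions; 3.2.1. Perfectly Matched Layer (PML) Absorbing Boundary Condition; 3.2.2. Uniaxial Perfectly Matched Layer (UPML) Absorbing Boundary Condition; 3.2.3. Setting and Tuning the Absorbing Boundary; Chapter 4 Implementation and Performance Analysis; 4.1. Description of the FDTD Implementations; 4.2. Error Comparison between CPU and GPU; 4.3. Preliminary GPU Performance Comparison; 4.3.1. Establishing the CPU Performance Baseline; 4.3.2. Choosing the Block Dimensions; 4.3.3. GPU Performance for 2D FDTD; 4.3.4. GPU Performance for 3D FDTD; 4.4. GPU Performance Optimization; 4.4.1. Improving Global Memory Access Efficiency; 4.4.2. Reducing the Number of Global Memory Reads; 4.4.3. Performance Comparison of the GPU Optimizations; 4.5. Asynchronous Transfers between Host and Device; 4.6. Parallel Operation of Multiple GPUs; 4.6.1. Dual-Card System; 4.6.2. Quad-Card System; 4.6.3. Estimated Transfer Limits for Parallel Computing across Multiple Hosts; Chapter 5 Conclusions and Future Work; References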
dc.language.iso	zh-TW
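dc.subject	graphics processing unit	en
dc.subject	parallel processing	en
dc.subject	Compute Unified Device Architecture	en
dc.subject	general-purpose computing on graphics processing units	en
dc.subject	finite-difference time-domain method	en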
dc.subject	FDTD	en
dc.subject	CUDA	en
dc.subject	GPU	en
dc.subject	GPGPU	en
dc.subject	parallel processing	en
dc.title	使用圖形處理單元加速時域有限差分法的計算	zh_TW
dc.title	Acceleration of the Finite-Difference Time-Domain Method Using Graphics Processing Unit	en
dc.type	Thesis
dc.date.schoolyear	98-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	許文翰,王子建
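dc.subject.keyword	finite-difference time-domain method,graphics processing unit,general-purpose computing on graphics processing units,Compute Unified Device Architecture,parallel processing	en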
dc.subject.keyword	FDTD,GPU,GPGPU,CUDA,parallel processing	en
dc.relation.page	71
dc.rights.note	Paid license
dc.date.accepted	2010-07-21
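dc.contributor.author-college	College of Electrical Engineering and Computer Science	en
dc.contributor.author-dept	Graduate Institute of Photonics and Optoelectronics	en
Appears in collections:	Graduate Institute of Photonics and Optoelectronics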

Files in this item:

File	Size	Format
ntu-99-1.pdf (not authorized for public access)	2.92 MB	Adobe PDF


Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.