Baseband Channel Simulation and Communication Receiver Implementation Using GPU Acceleration

Yu-Yen Chen; 陳語煙

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/44011

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	闕志達(Tzi-Dar Chiueh)
dc.contributor.author	Yu-Yen Chen	en
dc.contributor.author	陳語煙	zh_TW
dc.date.accessioned	2021-06-15T02:36:12Z	-
dc.date.available	2009-08-19
dc.date.copyright	2009-08-19
dc.date.issued	2009
dc.date.submitted	2009-08-13
dc.identifier.citation	[1] NVIDIA. 2008. CUDA Zone; http://www.nvidia.com/CUDA. [2] NVIDIA. 2008. CUDA Programming Guide, version 2.0. [3] NVIDIA. 2008. CUDA Reference Manual, version 2.0. [4] J. D. Owens, M. Houston, D. Luebke, S. Green, J. E. Stone, and J. C. Phillips, “GPU Computing,” Proc. IEEE, vol. 96, no. 5, pp. 879-899, May 2008. [5] E. Lindholm, J. Nickolls, S. Oberman, and J. Montrym, “NVIDIA Tesla: A Unified Graphics and Computing Architecture,” IEEE Micro, vol. 28, no. 2, pp. 39-55, Mar.-Apr. 2008. [6] M. Garland, S. L. Grand, J. Nickolls, J. Anderson, S. Morton, E. Phillips, Y. Zhang, and V. Volkov, “Parallel Computing Experiences with CUDA,” IEEE Micro, vol. 28, no. 4, pp. 13-27, Jul.-Aug. 2008. [7] Tom R. Halfhill, “Parallel Processing with CUDA,” Microprocessor Report, Jan. 28, 2008, from http://www.mdronline.com. [8] David Kanter, “NVIDIA’s GT200: Inside a Parallel Processor,” Real World Technologies, Sep. 8, 2008, from http://www.realworldtech.com. [9] David Kirk and Wen-Mei W. Hwu. NVIDIA CUDA Programming Massively Parallel Processors. Retrieved Jun. 30-Jul. 2, 2008, from National Center for High-Performance Computing, Taiwan. https://edu.nchc.org.tw/course/one_course_introduction.asp?lms_auto_course_id=992&from_course_list_url=homepage [10] GPGPU, http://gpgpu.org/. [11] Tzi-Dar Chiueh and Pei-Yun Tsai, OFDM Baseband Receiver Design for Wireless Communications. Taiwan, John Wiley & Sons, Inc., 2007. [12] J. G. Proakis, Digital Communications, 4th Ed., McGraw-Hill, 2001. [13] D. B. Lloyd, C. Boyd, and N. Govindaraju, “Fast Computation of General Fourier Transforms on GPUs,” IEEE International Conference on Multimedia and Expo, pp. 5-8, Jun. 23-26, 2008. [14] N. K. Govindaraju, B. Lloyd, Y. Dotsenko, B. Smith, and J. Manferdelli, “High Performance Discrete Fourier Transforms on Graphics Processors,” ACM/IEEE Conference on Supercomputing, Nov. 15-21, 2008. [15] D. R. Horn, M. Houston, and P. Hanrahan, “ClawHMMER: A Streaming HMMer-Search Implementation,” ACM/IEEE Conference on Supercomputing, Nov. 12-18, 2005. [16] V. Balu, J. P. Walters, S. Kompalli, and V. Chaudhary, “Evaluating the Use of GPUs for Life Science Application,” Tech. Report, University at Buffalo, New York, May 22, 2008. [17] C. Y. Wang, C. H. Liu, C. H. Liao, and T. D. Chiueh, A High Performance Integrated Baseband 16x16 MIMO Wireless Channel Simulator with 32-Bit Floating-Point Precision, National Taiwan University, 2008. [18] C. H. Liao, T. P. Wang, and T. D. Chiueh, “A Novel Low-Complexity Rayleigh Fader for Real-Time Channel Modeling,” 2007 IEEE International Symposium on Circuits and Systems, New Orleans, May 27-30, 2007. [19] E. Costa and S. Pupolin, “M-QAM-OFDM System Performance in the Presence of a Nonlinear Amplifier and Phase Noise,” IEEE Trans. Commun., vol. 50, pp. 462-472, Mar. 2002. [20] Keng-Hsien Lin, Design of a Baseband Receiver for DVB-T Standard, Master Thesis, Graduate Institute of Electronics Engineering, National Taiwan University, Taipei, Taiwan, Jul. 2005. [21] “Digital Video Broadcasting (DVB); Framing Structure, Channel Coding and Modulation for Digital Terrestrial Television,” ETSI EN 300 744 v1.5.1, Nov. 2004. [22] “Digital Video Broadcasting (DVB); Implementation Guidelines for DVB Terrestrial Service; Transmission Aspects,” ETSI TR 101 190 v1.2.1, Nov. 2004. [23] Rohde & Schwarz Corp., Digital TV Rigs and Recipes, 2006. [24] Bevan M. Baas, “A Generalized Cached-FFT Algorithm,” IEEE ICASSP, vol. 5, pp. 89-92, Mar. 18-23, 2005. [25] C. M. Chen and Y. H. Huang, “Partial Cached-FFT Algorithm for OFDMA Communications,” IEEE TENCON, pp. 1-4, Oct. 30-Nov. 2, 2007.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/44011	-
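dc.description.abstract	The graphics processing unit (GPU) has evolved over several generations, from computation units with only fixed functions into a highly programmable processor. Because GPUs typically offer very high memory bandwidth and excellent floating-point performance, the idea arose of using them to accelerate non-graphics computation, known as General-Purpose Computing on Graphics Processing Units (GPGPU). However, not every computation is suited to GPU acceleration: work handled by the GPU must be repetitive with low data dependency, that is, highly parallelizable. This thesis uses the Compute Unified Device Architecture (CUDA), NVIDIA's GPGPU model, to implement in parallel the Fast Fourier Transform (FFT) commonly found in Orthogonal Frequency-Division Multiplexing (OFDM) communication systems, as well as the Viterbi decoder commonly used to correct transmission errors. The same parallelization concepts are also applied to a baseband channel simulator and a communication receiver: time-consuming computations that can be parallelized are moved from the central processing unit (CPU) to the GPU to achieve acceleration. Besides presenting the methods and results of GPU acceleration, this thesis also shares experience in GPU programming.	zh_TW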
dc.description.abstract	The Graphics Processing Unit (GPU) has evolved over several generations from a fixed-function graphics pipeline into a programmable parallel processor. Because the GPU offers high memory bandwidth and high floating-point throughput, measured in giga floating-point operations per second (GFLOPS), it is also used to accelerate non-graphics computations, an approach referred to as General-Purpose Computing on Graphics Processing Units (GPGPU). However, not all algorithms benefit equally from this technique. GPU computing suits only algorithms with repetitive computation and low data dependency, that is, algorithms with a high degree of parallelism. This thesis analyzes and implements two main groups of communication applications, all implemented and simulated on a general-purpose personal computer in the NVIDIA CUDA environment. First, the thesis presents the method and results of implementing baseband channel simulation with parallel computing. It then applies parallel computing concepts to the implementation of a communication receiver. The thesis covers both the speedup achieved and the GPU programming experience gained.	en
dc.description.provenance	Made available in DSpace on 2021-06-15T02:36:12Z (GMT). No. of bitstreams: 1 ntu-98-R96943026-1.pdf: 8115836 bytes, checksum: f1ab9a1998640098f351a0cbb89bee91 (MD5) Previous issue date: 2009	en
dc.description.tableofcontents	Table of Contents; List of Figures; List of Tables; Chapter 1 Introduction; 1.1. Research Background; 1.2. Research Motivation; 1.3. Introduction to GPGPU; 1.3.1. Evolution of the GPU; 1.3.2. GPGPU Software Environments; 1.4. Thesis Organization; Chapter 2 Introduction to the Compute Unified Device Architecture (CUDA); 2.1. Chapter Overview; 2.2. Programming Model; 2.2.1. Execution Model; 2.2.2. Thread Hierarchy; 2.2.3. Memory Hierarchy; 2.3. GPU Hardware Architecture; 2.3.1. Overview of the GPU Hardware Architecture; 2.3.2. Thread Scheduling; 2.4. Introduction to the Application Programming Interface (API); 2.4.1. Extensions to the C Programming Language; 2.4.1.1. Function Type Qualifiers; 2.4.1.2. Variable Type Qualifiers; 2.4.1.3. Built-in Variables; 2.4.2. Common Usage; 2.4.2.1. Device Functions (Kernel); 2.4.2.2. Calling Device Functions (Kernel) from the Host; 2.4.2.3. Device Memory Usage; 2.5. Programming Guidelines; Chapter 3 Examples of Parallel Computing on the GPU; 3.1. Chapter Overview; 3.2. Parallel Implementation of the Fast Fourier Transform; 3.2.1. Serial Implementation of the FFT; 3.2.2. Parallel Implementation of the FFT; 3.2.2.1. Cooley-Tukey Algorithm; 3.2.2.2. Stockham Algorithm; 3.2.2.3. Large Size FFT; 3.2.2.4. Radix-4 FFT; 3.2.2.5. Cached-FFT Algorithm; 3.2.3. Speedup of the Parallel Implementation; 3.3. Parallel Implementation of the Viterbi Decoder; 3.3.1. Serial Implementation of the Viterbi Decoder; 3.3.2. Parallel Implementation of the Viterbi Decoder; 3.3.2.1. Radix-2 Viterbi Decoder; 3.3.2.2. Radix-4 Viterbi Decoder; 3.3.3. Speedup of the Parallel Implementation and Discussion; Chapter 4 Parallel Implementation of Baseband Channel Simulation; 4.1. Chapter Overview; 4.2. Introduction to Baseband Channel Simulation; 4.2.1. Multipath Rayleigh Fading Channel Model; 4.2.2. Receiver Non-Ideal Effects; 4.3. Rayleigh Fader; 4.3.1. Architecture of the Parallel Rayleigh Fader Implementation; 4.3.2. Parallel Implementation of the Modified Wu’s Model; 4.3.3. Parallel Implementation of the Rician Fading Channel Model; 4.3.4. Parallel Implementation of Matrix Multiplication; 4.3.5. Speedup of the Parallel Rayleigh Fader Implementation; 4.4. Speedup of Baseband Channel Simulation; 4.4.1. Architecture of the Parallel Baseband Channel Simulation; 4.4.2. Speedup of the Parallel Baseband Channel Simulation; Chapter 5 Parallel Implementation of a DVB-T (European Digital Terrestrial Television Broadcasting) Receiver; 5.1. Chapter Overview; 5.2. Overview of DVB-T; 5.2.1. Introduction to OFDM Modulation; 5.2.2. Overview of the DVB-T System Specification; 5.2.3. Transmitter System Architecture; 5.3. Receiver System Architecture; 5.3.1. Block Diagram; 5.3.2. Acquisition and Synchronization; 5.3.3. Tracking Loop; 5.3.4. Data Recovery; 5.4. Parallel Implementation of the Receiver; 5.4.1. Block Diagram; 5.4.2. Fast Fourier Transform (FFT); 5.4.3. Phase Modification; 5.4.4. Channel Estimation and Compensation; 5.4.4.1. Time-Domain Interpolator; 5.4.4.2. Frequency-Domain Equalizer (FEQ); 5.4.5. De-Mapper; 5.4.6. De-Interleaver; 5.4.6.1. De-Symbol-Interleaver; 5.4.6.2. De-Bit-Interleaver; 5.4.6.3. Multiplexer (MUX); 5.4.7. Viterbi Decoder; 5.5. Speedup of the Parallel Receiver Implementation; Chapter 6 Conclusions and Future Work; References
dc.language.iso	zh-TW
dc.title	利用繪圖處理器平行運算實現基頻通道模擬與通訊接收機	zh_TW
dc.title	Baseband Channel Simulation and Communication Receiver Implementation Using GPU Acceleration	en
dc.type	Thesis
dc.date.schoolyear	97-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	李學智,蔡佩芸
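dc.subject.keyword	General-Purpose Computing on Graphics Processing Units,Compute Unified Device Architecture,European Digital Television,Baseband Digital Communication	zh_TW
dc.subject.keyword	GPGPU,CUDA,DVB-T,Digital Baseband Communication	en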
dc.relation.page	105
dc.rights.note	Paid license
dc.date.accepted	2009-08-13
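dc.contributor.author-college	College of Electrical Engineering and Computer Science	zh_TW
dc.contributor.author-dept	Graduate Institute of Electronics Engineering	zh_TW
Appears in Collections:	Graduate Institute of Electronics Engineering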

Files in this item:

File	Size	Format
ntu-98-1.pdf (not currently authorized for public access)	7.93 MB	Adobe PDF

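Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.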