Research on Algorithms, Hardware Architectures, and System Design for Multiview Video Coding

Li-Fu Ding; 丁立夫

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/9417

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	陳良基(Liang-Gee Chen)
dc.contributor.author	Li-Fu Ding	en
dc.contributor.author	丁立夫	zh_TW
dc.date.accessioned	2021-05-20T20:21:38Z	-
dc.date.available	2010-03-10
dc.date.available	2021-05-20T20:21:38Z	-
dc.date.copyright	2009-03-10
dc.date.issued	2008
dc.date.submitted	2009-02-03
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/9417	-
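dc.description.abstract	Three-dimensional (3D) video provides a complete sense of scene realism by delivering images of different views to the left and right eyes simultaneously. In a 3D video system, multiview video coding plays an important role because the large amount of data must be compressed effectively. In this dissertation, we study multiview video coding at three levels: algorithms, hardware architecture, and system design. In the algorithm research, we propose fast algorithms for the prediction core and for color correction in a 3D video coding system. The prediction core comprises two operations, motion estimation and disparity estimation, which remove temporal and spatial redundancy, respectively. However, the prediction core requires a large amount of computation, which makes software and hardware design difficult, so we propose a content-aware prediction algorithm to reduce computational complexity. By first using disparity estimation to find corresponding blocks in different views that contain the same object content, coding information can be obtained efficiently from already-coded views. This saves 98.4 to 99.1% of the computation in most view channels at the cost of a PSNR drop of only 0.03 to 0.06 dB. In addition, to address the luminance and chrominance mismatch between views, we propose a luminance and chrominance correction algorithm based on disparity-compensated linear regression that also reuses motion vector information, improving coding efficiency by 0.4 dB. In multiview video coding, the prediction core has always been the most computationally complex part. Current processors cannot meet real-time encoding requirements under this load, so hardware acceleration is necessary. In the second part of the dissertation, we propose two prediction algorithms and their corresponding hardware architectures, for stereo and multiview video coding systems respectively. First, we propose a joint prediction algorithm for stereo video coding systems to increase coding efficiency and reduce computation. The joint prediction algorithm exploits the characteristics of stereo video, reducing computation by 80% while also improving image quality. The chip designed and implemented for this algorithm measures about 2.13×2.13 mm² and contains 137K logic gates and 20.75 Kbits of on-chip memory. Compared with conventional architectures, it reaches the same specification with only 11.5% of the on-chip memory and 3.3% of the processing elements. Next, we broaden the research from stereo to multiview systems and improve the predictor-centered algorithm used in motion estimation to increase motion vector accuracy. The algorithm saves 96% of the computational complexity while video quality drops by only 0.045 dB. Its search range supports up to [-256, +255]/[-256, +255] in the horizontal and vertical directions, so it meets the needs of quad full high-definition video. We design the corresponding hardware architecture around the concept of cache processing, and the cache mechanism saves 39% of the external memory bandwidth. The implemented chip requires only 230K logic gates and 8 KB of memory and processes quad full high-definition video in real time at an operating frequency of 300 MHz. The third part of the dissertation proposes a bandwidth analysis method and a single-chip encoder hardware architecture for multiview coding systems. First, for the variety of complex multiview coding structures, we propose a new system bandwidth analysis method that uses the "precedence constraint" from graph theory to handle the prediction dependencies among frames in a multiview coding structure. Based on this analysis, hardware resources can be allocated optimally for different coding structures. Next, we propose the system architecture of a multiview video encoder applicable to future 3D television and quad high-definition television. It includes a new eight-stage macroblock pipeline architecture and system scheduling, together with a cache-based prediction core, and can encode in real time from single-view 4096×2160p and three-view 1920×1080p to seven-view 1280×720p. In addition, to maintain the quality of high-resolution video, the architecture supports a search range 4 to 64 times larger than that of conventional architectures. Compared with conventional architectures, it saves 79% of the system bandwidth and 94% of the on-chip memory. The implemented chip uses a 90 nm CMOS process and measures 11.46 mm². Through the new system scheduling, a high degree of parallelism, algorithmic optimization, and module-wise clock gating, it achieves a throughput of 212 million pixels per second. This chip is the world's first single-chip multiview video encoder. In short, our contributions to multiview coding technology lie in three directions. The proposed algorithms can be applied in the core of multiview coding systems. We were also the first to propose hardware architectures and chips for the prediction cores of stereo and multiview video coding. Finally, integrating the first two parts of the dissertation, the proposed multiview video coding system is the most advanced design in the world. With the proposed multiview coding technology, 3D video can be realized in many practical everyday applications, and we sincerely hope that our research results bring an epoch-making breakthrough in the convenience of human life.	zh_TW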
dc.description.abstract	Three-dimensional (3D) video can provide a complete scene structure to human eyes by transmitting several views simultaneously. In 3D video systems, multiview video coding (MVC) plays a critical role because the huge amount of data makes compression necessary. In this dissertation, MVC is studied at three levels: the algorithm level, the VLSI architecture level, and the system design level. Because the hardware-oriented algorithms and the VLSI architectures are closely related, several proposed algorithms are discussed together with their corresponding architectures. The dissertation is therefore divided into three parts: efficient MVC algorithms, algorithm and architecture co-design of the MVC prediction core, and system analysis and architecture design of the MVC encoder. In the first part, two new fast algorithms are developed for the prediction core and for color correction. The prediction core mainly consists of motion estimation (ME) and disparity estimation (DE), which remove temporal and inter-view redundancy, respectively. We develop a content-aware prediction algorithm (CAPA) to reduce computational complexity. By using DE to find corresponding blocks between different views, coding information can be shared and reused from the already-coded view channel. CAPA saves 98.4 to 99.1% of the computational complexity of ME in most view channels with a negligible quality loss of only 0.03 to 0.06 dB in PSNR. In addition, we develop a luminance and chrominance correction algorithm to deal with the color mismatch between views. The proposed algorithm is based on motion-compensated linear regression, which reuses the motion information from the encoder and improves coding gain by up to 0.4 dB. The prediction core is always the most computation-intensive part of an MVC system, so hardware acceleration is necessary to meet real-time processing requirements. 
In the second part of this dissertation, two prediction algorithms and their corresponding VLSI architectures are proposed for stereo video coding and MVC, respectively. First, we propose a joint prediction algorithm (JPA) for high coding efficiency and low computational complexity. JPA exploits the characteristics of stereo video and reduces computational complexity by about 80% while enhancing the video quality. A corresponding architecture is designed and implemented on a 2.13x2.13 mm^2 die with only 137K logic gates and 20.75 Kbits of on-chip SRAM. The JPA architecture is area-efficient: it uses only 11.5% of the on-chip SRAM and 3.3% of the processing elements of conventional architectures. Second, we extend the design space from stereo video to MVC systems. A predictor-centered prediction algorithm is modified to enhance motion vector (MV) accuracy; it saves 96% of the computational complexity at the cost of a quality loss of only 0.045 dB. The proposed algorithm supports a search range of up to [-256, +255]/[-256, +255] in the horizontal/vertical directions, so it can fulfill the requirements of quad full high-definition (QFHD) video. A corresponding architecture is then presented based on the cache processing concept. The proposed cache architecture saves 39% of the off-chip memory bandwidth with a rapid prefetching algorithm. The implementation results show that the proposed cache-based prediction architecture has a low hardware cost: only 230K logic gates and 8 KB of on-chip SRAM (2.1x2.1 mm^2 in a 90 nm process) are required to process QFHD video in real time at a working frequency of 300 MHz. The third part of this dissertation describes the system analysis and architecture design of the MVC encoder, integrating the know-how and design experience accumulated in the previous parts. First, a new system bandwidth analysis scheme is proposed for various complicated MVC structures. 
The precedence constraint from graph theory is adopted to derive the processing order of frames in an MVC system. A suitable hardware resource allocation can be easily determined with the proposed analysis scheme. Second, the system architecture of an MVC encoder for 3D/QFHD applications is presented. An eight-stage macroblock-pipelined architecture with the proposed system scheduling and cache-based prediction core supports real-time processing from one-view 4096x2160p to seven-view 720p video. In addition, the search range is four to sixty-four times larger than in previous work to maintain HD video quality. The proposed architecture saves 79% of the system memory bandwidth and 94% of the on-chip SRAM. A prototype chip is implemented on an 11.46 mm^2 die in 90 nm CMOS technology. A throughput of 212M pixels/s and an efficiency of 407M pixels/Watt are achieved through the proposed system scheduling, a high degree of parallelism to reduce memory access, algorithmic optimization, and module-wise clock gating. The proposed architecture is the first reported MVC single-chip encoder worldwide. In brief, we believe that with the MVC technologies proposed in this dissertation, 3D video processing can be realized in many real applications. We sincerely hope that our research contributions can create a new era for digital multimedia life.	en
dc.description.provenance	Made available in DSpace on 2021-05-20T20:21:38Z (GMT). No. of bitstreams: 1 ntu-97-D94943018-1.pdf: 7913053 bytes, checksum: 5481004ee8f613b31b855dee6add5794 (MD5) Previous issue date: 2008	en
dc.description.tableofcontents	Abstract
1 Introduction
  1.1 Trends of 2D and 3D Television Systems
  1.2 Development of 2D and 3D Video Coding Standards
    1.2.1 2D Video Coding: From MPEG-1, H.261 to MPEG-4/H.264
    1.2.2 MPEG-2 Multiview Profile (MVP)
    1.2.3 MPEG-C Part3/Advanced Three-Dimensional Television System Technologies (ATTEST)
    1.2.4 H.264/AVC Multiview High Profile
  1.3 Research Topics and Contributions
    1.3.1 Algorithms of Multiview Video Coding
    1.3.2 VLSI Architectures of Multiview Video Coding
    1.3.3 System Design of Multiview Video Coding
  1.4 Dissertation Organization
I Efficient Algorithms for Multiview Video Coding
2 Content-Aware Prediction Algorithm with Inter-View Mode Decision for Multiview Video Coding
  2.1 Introduction
  2.2 Analysis of Macroblock Prediction Modes in MVC
  2.3 Proposed Content-Aware Prediction Algorithm (CAPA) With Inter-View Mode Decision
    2.3.1 System Architecture
    2.3.2 Disparity Estimation and Selection of Twin-Macroblock
    2.3.3 Inter-View Mode Decision
    2.3.4 Content-Aware Motion Estimation
  2.4 Simulation Results
    2.4.1 Quality and Complexity Analysis of Inter-View Mode Decision
    2.4.2 Quality and Complexity Analysis of Content-Aware Motion Estimation
    2.4.3 Rate-Distortion Performance of Content-Aware Prediction Algorithm
  2.5 Summary
3 Fast Luminance and Chrominance Correction Based on Disparity Compensated Linear Regression for Multiview Video Coding
  3.1 Introduction
  3.2 Proposed Algorithm
    3.2.1 Conventional Encoding Process with Weighted Prediction
    3.2.2 Problem Definition
    3.2.3 Linear Regression With Reused Disparity Information
    3.2.4 Computational Cost and Bandwidth Overhead
  3.3 Simulation Result
    3.3.1 Environment Setup
    3.3.2 Initial Correction Models
    3.3.3 Simulation Result
  3.4 Summary
II Algorithm and Architecture Co-Design of Prediction Core for 3D/Quad High Definition Video Coding
4 Joint Prediction Algorithm and Architecture for Stereo Video Hybrid Coding Systems
  4.1 Introduction
  4.2 Proposed Joint Prediction Algorithm
    4.2.1 Joint Block Compensation
    4.2.2 MV-DV Prediction
    4.2.3 Mode Pre-decision
    4.2.4 Experimental Analysis and Comparison
  4.3 Block Matching Algorithm for ME/DE
    4.3.1 Modified Hierarchical ME/DE Block Matching Algorithm
    4.3.2 Near-Overlapped Candidates Reuse Scheme (NOCRS)
  4.4 Prediction Core Architecture
    4.4.1 RSRN
    4.4.2 128-PE Adder Tree
    4.4.3 NOCRC
    4.4.4 JBG
    4.4.5 Memory Organization and Data Reuse Scheme
    4.4.6 Proposed Scheduling for Stereo Video Coding System
    4.4.7 Chip Implementation
    4.4.8 Comparisons
  4.5 Conclusion
5 Analysis and Algorithm Design of High-throughput Prediction Core for 3D/Quad HDTV Videos
  5.1 Introduction
  5.2 Design Challenges
    5.2.1 High Computational Complexity
    5.2.2 High External Memory Bandwidth and Large On-Chip Memory Size
  5.3 Predictor-Centered Search Block Matching Algorithm
    5.3.1 Inter Predictor
    5.3.2 Intra Predictor
  5.4 Proposed Algorithms
    5.4.1 Motion Information Preserving
    5.4.2 Data Dependency and Regularity of Access Pattern
    5.4.3 Variable-Block-Size Data Reuse
    5.4.4 Cache as Search Range Buffer
  5.5 Simulation Result
    5.5.1 Analysis of Refinement Range
    5.5.2 Reduction of Computational Complexity
  5.6 Summary
6 Analysis and Architecture Design of Cache-Based Prediction Core for 3D/Quad HDTV Videos
  6.1 Introduction to Generic Cache System
    6.1.1 Two Dimensional Cache Organization
  6.2 Design Challenges
    6.2.1 Cache Miss
    6.2.2 Overhead of Data Prefetching
  6.3 Access Pattern and Throughput Requirement
    6.3.1 Access Pattern of IME
    6.3.2 Access Pattern of FME
  6.4 Proposed Prefetching Algorithm
    6.4.1 Rapid Prefetching Pattern
    6.4.2 Concurrent Reading and Prefetching
    6.4.3 Priority-based Replacement Policy
    6.4.4 Integration with Macro-Block Pipeline
  6.5 Proposed Architecture for Cache Design
    6.5.1 Overall Architecture
    6.5.2 Non-blocking Cache Refill
    6.5.3 Concurrent Reading and Prefetching
  6.6 Cache Profiling
  6.7 Summary
III System Analysis and Architecture Design of 4096x2160p Multiview Video Single-Chip Encoder
7 System Bandwidth Analysis of Multiview Video Coding with Precedence Constraint
  7.1 Introduction
  7.2 Previous Data Reuse Schemes for Mono-View Video Coding Systems
  7.3 Proposed System Analysis Scheme with Precedence Constraint
    7.3.1 System Architecture of Multiview Video Coding Systems
    7.3.2 System Bandwidth Analysis with Precedence Constraint
  7.4 Case Studies and Performance Evaluation
  7.5 Summary
8 System Architecture of a 4096x2160p Multiview Video Single-Chip Encoder
  8.1 Introduction
  8.2 Proposed View-Parallel Macroblock-Interleaved Scheduling
  8.3 Cache-Based Temporal/Inter-view Prediction Stage
    8.3.1 Cache Controller Architecture
    8.3.2 Predictor-Centered Algorithm and Architecture for IMDE Stage
    8.3.3 Algorithm and Architecture Optimization for FMDE Stage
  8.4 Hybrid Open-Closed Loop Intra Prediction and Pixel-Forwarding Reconstruction Stage
    8.4.1 Design Challenges
    8.4.2 Hardware-Oriented Hybrid Open-Closed Loop Intra Prediction
    8.4.3 Architecture of Hybrid Open-Closed Loop Intra Prediction and Pixel-Forwarding Reconstruction
  8.5 Frame-Parallel Pipeline-Doubled Dual Context-Based Adaptive Binary Arithmetic Coding Stage
    8.5.1 Introduction to Context-Based Adaptive Binary Arithmetic Coding (CABAC): Algorithm and Architecture
    8.5.2 Analysis of CABAC Symbol Rate
    8.5.3 Frame-Parallel Pipeline-Doubled Dual CABAC Architecture
  8.6 Implementation Results
    8.6.1 Chip Implementation
    8.6.2 Chip Comparison
  8.7 Summary
9 Conclusion
  9.1 Principal Contributions
    9.1.1 Efficient Algorithm Development for Multiview Video Coding
    9.1.2 Algorithm and Architecture Co-Design of Prediction Core for 3D/Quad High Definition Video Coding
    9.1.3 System Analysis and Architecture Design of 4096x2160p Multiview Video Single-Chip Encoder
  9.2 Future Directions
    9.2.1 Algorithm Level
    9.2.2 Architecture Level
    9.2.3 System Level
Bibliography
dc.language.iso	en
dc.title	多視角視訊編碼技術之演算法和硬體架構及系統設計之研究	zh_TW
dc.title	Multiview Video Coding: Algorithms, VLSI Architectures, and System Design	en
dc.type	Thesis
dc.date.schoolyear	97-1
dc.description.degree	Doctoral
dc.contributor.coadvisor	簡韶逸(Shao-Yi Chien)
dc.contributor.oralexamcommittee	杭學鳴(Hsueh-Ming Hang),陳永昌(Yung-Chang Chen),李鎮宜(Chen-Yi Lee),楊家輝(Jar-Ferr Yang),吳安宇(An-Yeu Wu),陳美娟(Mei-Juan Chen),賴永康(Yeong-Kang Lai)
dc.subject.keyword	視訊壓縮,硬體架構,H.264/AVC,三維立體,多視角視訊	zh_TW
dc.subject.keyword	Video coding, VLSI architecture, H.264/AVC, Multiview video coding	en
dc.relation.page	229
dc.rights.note	Authorization granted (open access worldwide)
dc.date.accepted	2009-02-03
dc.contributor.author-college	電機資訊學院	zh_TW
dc.contributor.author-dept	電子工程學研究所	zh_TW
Appears in Collections:	Graduate Institute of Electronics Engineering (電子工程學研究所)

Files in This Item:

File	Size	Format
ntu-97-1.pdf	7.73 MB	Adobe PDF


Items in the system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.