Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/45662
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 曹恆偉(Hen-Wai Tsao) | |
dc.contributor.author | Jhe-Yi Lin | en |
dc.contributor.author | 林哲毅 | zh_TW |
dc.date.accessioned | 2021-06-15T04:33:27Z | - |
dc.date.available | 2012-08-20 | |
dc.date.copyright | 2009-08-20 | |
dc.date.issued | 2009 | |
dc.date.submitted | 2009-08-19 | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/45662 | - |
dc.description.abstract | This thesis proposes a real-time 2D-to-3D video conversion algorithm based on the human visual system (HVS). Its main contribution and novelty lie in incorporating, beyond depth-perception cues, further characteristics of the HVS: the multi-resolution space of the human eye formed by visual acuity and viewing angle, and the Just-Noticeable Difference (JND), which models visual redundancy. Incorporating these characteristics not only improves subjective processing quality but also inherits the high adaptivity and generality of the HVS, qualities that most multimedia systems strive for. The processing flow of the proposed algorithm begins with an HVS-based spatial-analysis preprocessing stage that extracts and localizes image features, region properties, edge positions, and edge orientations for the later stages. This stage consists of the discrete wavelet transform (DWT) and quadtree segmentation. The fixed scales of the wavelet transform, combined with the dynamic scales of the quadtree, provide a more complete spatial analysis of the image. Quadtree segmentation is also inherently tolerant to noise and lighting variation, which makes it well suited to segmenting general images. The core of quadtree segmentation is its stopping criterion; we couple it with the JND and exploit visual redundancy to tolerate analysis errors caused by luminance jitter. The proposed preprocessing not only extracts image features, but also reduces the execution time and memory usage of the later stages while improving their quality and accuracy.
Our conversion system has two main goals. First, use linear perspective to render scene depth with realistic proportions and positions. Second, based on the fact that humans are more sensitive to moving objects, segment moving objects and assign each object a relative depth according to its position in the image and its size and relation to other objects. To achieve these goals, vanishing-line and vanishing-point detection for linear perspective and moving-object segmentation follow the preprocessing stage, and the final depth-map reconstruction stage then builds a depth map corresponding to the input image. For vanishing-line detection we use the Hough transform, exploiting the pyramid subsampling and level-of-detail properties of the wavelet transform to reduce memory usage and improve the accuracy of vanishing-line and vanishing-point detection. Functional-level simulation on hall_monitor_cif.yuv shows that the proposed vanishing-point detection achieves 98% accuracy while reducing execution time by roughly 90% compared with running vanishing-line detection directly on the original image. For moving-object segmentation we use the Wronskian change detector. To resolve the conflict between the detector's high tolerance to lighting variation and the completeness of the segmented objects, we again apply multi-resolution: an image pyramid built from the DWT extracts a low-resolution object region mask, and blocks inside the mask are then thresholded with values that depend on their level in the quadtree pyramid. Experiments confirm that this method not only preserves object completeness but also avoids missegmentation caused by scene lighting changes and object shadows. For the hardware implementation we follow the standard cell-based design flow and meet the requirement of processing thirty CIF (352x288 pixels) frames per second. The chip is fabricated in the TSMC 0.13 um process with about 4.2M logic gates, operates at 54 MHz, and has a core area of 1.572 mm x 1.572 mm. | zh_TW |
dc.description.abstract | In this thesis, we propose a 2D-to-3D video conversion algorithm and architecture based on the human visual system (HVS). We aim to incorporate characteristics of the HVS and of human depth perception, such as the multi-resolution behavior that results from different viewing angles and viewing distances, and the Just-Noticeable Difference (JND).
Our proposed system has two main goals: using the concept of linear perspective to reproduce the background depth map, and creating a foreground depth map based on the fact that humans are more sensitive to moving objects. To reach these goals, we use vanishing-line and vanishing-point detection to extract the linear perspective of the processed image, and Wronskian change detection to segment moving objects. Because both of these are high-level processes, we further propose an HVS-based preprocessing stage composed of two main parts: the discrete wavelet transform (DWT) and quadtree segmentation. The DWT produces a multi-resolution image pyramid that supplies the features needed by the later stages. Quadtree segmentation, a simple yet powerful process, performs region segmentation and yields clusters of homogeneous regions; the later stages adapt their processing to the segmentation level. To meet the real-time requirement, the proposed algorithm is implemented with a cell-based design flow. The design is synthesized under a 54 MHz constraint with a total gate count of about 4.2M gates. The chip is implemented in TSMC 0.13 um technology, and the layout shows a core size of 1.572 mm x 1.572 mm. | en |
dc.description.provenance | Made available in DSpace on 2021-06-15T04:33:27Z (GMT). No. of bitstreams: 1 ntu-98-R96943020-1.pdf: 14416610 bytes, checksum: 8c0e0e2409807bfa999a47657dff28d5 (MD5) Previous issue date: 2009 | en |
dc.description.tableofcontents | Abstract (Chinese)
Abstract (English)
Chapter 1  Introduction
  1.1 Stereoscopic display technology
  1.2 Motivation
  1.3 Contributions
  1.4 Thesis organization
Chapter 2  Overview of 2D-to-3D conversion technology
  2.1 3D image synthesis systems
  2.2 Depth-Image-Based Rendering (DIBR) image synthesis
  2.3 Depth map generation
    2.3.1 Depth images
    2.3.2 Depth cameras
    2.3.3 Depth cues
  2.4 Development of 2D-to-3D conversion algorithms
    2.4.1 Depth from motion parallax
    2.4.2 Depth from blur
    2.4.3 Depth from geometric perspective
    2.4.4 Depth from pictorial perspective
    2.4.5 Depth from edge information
    2.4.6 Depth from other depth cues
  2.5 Summary
Chapter 3  Algorithm design for real-time 2D-to-3D video conversion
  3.1 Human visual system (HVS)
    3.1.1 Introduction
    3.1.2 Multi-resolution of the human eye
    3.1.3 Just-Noticeable Difference (JND)
    3.1.4 Depth cues
  3.2 Moving object segmentation
    3.2.1 Introduction
    3.2.2 Background subtraction
    3.2.3 Wronskian change detector
  3.3 Line detection
    3.3.1 Introduction
    3.3.2 Canny edge detection
    3.3.3 Hough transform
  3.4 System overview
  3.5 Image preprocessing: quadtree-based spatial analysis
    3.5.1 Image preprocessing
    3.5.2 Quadtree segmentation and the JND-based stopping criterion
  3.6 HVS-based real-time 2D-to-3D video conversion algorithm
    3.6.1 Test sequences
    3.6.2 System architecture
    3.6.3 HVS-based spatial-analysis preprocessing
    3.6.4 Multi-resolution moving object segmentation with Wronskian change detection
    3.6.5 Multi-resolution vanishing line detection
    3.6.6 Vanishing point extraction
    3.6.7 Depth map reconstruction
  3.7 Experimental results
  3.8 Conclusion
Chapter 4  Hardware architecture design for real-time 2D-to-3D video conversion
  4.1 System architecture
  4.2 Hardware architecture
    4.2.1 Input network
    4.2.2 Analysis processes
Chapter 5  Chip implementation
  5.1 Introduction
  5.2 Chip design flow
  5.3 Chip specifications
Chapter 6  Conclusion
References | |
dc.language.iso | zh-TW | |
dc.title | HVS-Based Real-Time 2D-to-3D Video Conversion Algorithm and Architecture Design | zh_TW |
dc.title | 2D-to-3D Real Time Video Conversion Algorithm and Architecture on the Basis of HVS | en |
dc.type | Thesis | |
dc.date.schoolyear | 97-2 | |
dc.description.degree | Master | |
dc.contributor.coadvisor | 范育成(Yu-Cheng Fan) | |
dc.contributor.oralexamcommittee | 郭景致(Chih-Ching Kuo),李揚漢(Yang-Han Lee) | |
dc.subject.keyword | stereoscopic display technology, 2D-to-3D conversion, human visual system, depth reconstruction, video conversion | zh_TW |
dc.subject.keyword | 3D display technology, 2D-to-3D conversion, human visual system, depth reconstruction, video transformation | en |
dc.relation.page | 89 | |
dc.rights.note | Licensed for a fee | |
dc.date.accepted | 2009-08-19 | |
dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW |
dc.contributor.author-dept | Graduate Institute of Electronics Engineering | zh_TW |
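The abstracts above describe a quadtree segmentation whose stopping criterion is coupled to the Just-Noticeable Difference (JND): a block stops splitting once its luminance spread falls below what the eye can distinguish at that block's mean luminance. A minimal Python sketch of this idea, using a Chou-Li-style background-luminance JND curve as an illustrative model (the thesis's exact constants, block-size limits, and JND formulation are not given in this record):

```python
import numpy as np

def jnd_threshold(mean_luma):
    """Chou-Li-style background-luminance JND (illustrative constants):
    the eye tolerates larger luminance errors in dark and bright regions."""
    if mean_luma <= 127:
        return 17.0 * (1.0 - np.sqrt(mean_luma / 127.0)) + 3.0
    return 3.0 / 128.0 * (mean_luma - 127.0) + 3.0

def quadtree_split(block, min_size=8):
    """Split a square power-of-two luminance block until each leaf's
    luminance spread falls below the JND of its mean (stop criterion).
    Returns leaves as (y, x, height, width) tuples in block coordinates."""
    h, w = block.shape
    spread = float(block.max()) - float(block.min())
    if h <= min_size or spread <= jnd_threshold(float(block.mean())):
        return [(0, 0, h, w)]  # homogeneous enough (or minimum size): leaf
    leaves = []
    hh, hw = h // 2, w // 2
    for dy in (0, hh):
        for dx in (0, hw):
            sub = block[dy:dy + hh, dx:dx + hw]
            for (y, x, lh, lw) in quadtree_split(sub, min_size):
                leaves.append((y + dy, x + dx, lh, lw))
    return leaves
```

On a flat region the tree stops immediately, while heavily textured regions split down to the minimum block size; tying the stop threshold to the JND is what lets the preprocessing tolerate small luminance jitter, as the abstract notes.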
Appears in Collections: | Graduate Institute of Electronics Engineering
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-98-1.pdf (currently not authorized for public access) | 14.08 MB | Adobe PDF |
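The moving-object segmentation stage described in the abstracts relies on the Wronskian change detector, which compares each frame to a background model through pixel ratios over a small neighborhood. A single-resolution Python sketch (the thesis applies the detector across a DWT pyramid with quadtree-level-dependent thresholds; the neighborhood radius and threshold below are illustrative assumptions):

```python
import numpy as np

def wronskian_change_mask(frame, background, radius=1, thresh=0.5):
    """Per-pixel Wronskian change detector: over an n-pixel neighborhood
    of ratios r = frame/background, W = mean(r^2) - mean(r).
    W is near 0 where the frame is close to a scaled copy of the
    background, so uniform lighting changes are largely tolerated."""
    f = frame.astype(np.float64) + 1.0   # +1 avoids division by zero
    b = background.astype(np.float64) + 1.0
    r = f / b
    h, w = r.shape
    out = np.zeros((h, w), dtype=bool)
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            n = r[y - radius:y + radius + 1, x - radius:x + radius + 1]
            W = np.mean(n * n) - np.mean(n)
            out[y, x] = abs(W) > thresh
    return out
```

Because W is built from the frame-to-background ratio, a global brightness change scales all ratios equally and produces only a small W, which is the lighting tolerance the abstract mentions; genuinely new objects change the ratios sharply and are flagged.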
Unless otherwise indicated, all items in this repository are protected by copyright, with all rights reserved.