Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/50458
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 貝蘇章 (Soo-Chang Pei)
dc.contributor.author | Yu-Ying Wang | en
dc.contributor.author | 王毓瑩 | zh_TW
dc.date.accessioned | 2021-06-15T12:41:35Z
dc.date.available | 2021-08-02
dc.date.copyright | 2016-08-02
dc.date.issued | 2016
dc.date.submitted | 2016-07-27
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/50458
dc.description.abstract | In recent years, with the maturation of stereo matching algorithms and the rapid development of depth cameras (e.g., Kinect), depth-map research and its applications have attracted growing attention in image processing. To make the field of depth-based image processing more complete, this dissertation proposes several advanced depth-based image processing techniques and applications, including a 3D unseen visible watermarking technique built on the depth no-synthesis-error model, a depth-assisted edge detection algorithm, depth-map hole filling, and an auditory depth image system for visually impaired users. The depth map is the main input for stereoscopic view generation, so depth information strongly affects the quality of synthesized views. The proposed 3D unseen visible watermarking technique uses the depth no-synthesis-error model to hide the auxiliary information to be delivered inside the depth map and deliver it jointly with the 3D content, without degrading the quality of the synthesized stereoscopic views. We also extend this method to unseen visible watermarking for multi-view synthesis. Experimental results show that the method is robust and imperceptible, and that it leaves the quality of the synthesized views completely unaffected.
Depth cameras are widely used because of their low price, but the raw depth maps they capture suffer from holes, noise, and other defects, with incomplete object boundaries being especially severe. Holes should therefore be filled before a depth map is used. This dissertation proposes a depth-map hole-filling technique based on a depth-assisted edge detection algorithm: only pixels on the same side of an edge are selected as reference pixels, and the holes are then filled by plane fitting. Because neighboring pixels on the opposite side of an edge are excluded from the reference set, blurring near edges is minimized. Experimental results show that this method produces hole-free, more accurate depth maps with more complete edges.
Finally, we also present an auditory depth image system that offers visually impaired users a different way to perceive images. | zh_TW
dc.description.abstract | Many depth-based applications have gained increasing interest in recent years owing to advances in stereo matching algorithms and depth camera sensors (e.g., Kinect). The use of depth maps facilitates many difficult computer vision tasks and assists the generation of stereoscopic views in Three-Dimensional TeleVision (3DTV). To make the field of depth-based image processing more complete, a number of advanced depth-based applications and image processing technologies are proposed in this dissertation, including 3D unseen visible watermarking (UVW) using the depth no-synthesis-error (D-NOSE) model, a depth-assisted edge detection algorithm, depth hole filling, and an auditory depth image system for visually impaired users.
Depth information is a main input to view synthesis, and the quality of synthesized views is very sensitive to it. We therefore propose a 3D UVW scheme based on the D-NOSE model that embeds auxiliary information in the depth map and delivers it jointly with the 3D multimedia content without affecting the quality of the synthesized views. Furthermore, this 3D UVW scheme is extended to multi-view UVW. Experimental results show that the proposed method offers strong robustness and imperceptibility, while the quality of the synthesized views is totally unaffected.
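The no-synthesis-error idea behind this embedding can be illustrated with a small sketch: under MPEG-style inverse-depth quantization, every 8-bit depth label that rounds to the same disparity as the original label produces exactly the same warped view, so a payload bit can be hidden in, say, the parity of the chosen label. The camera parameters, rendering precision, and parity rule below are illustrative assumptions, not the dissertation's actual settings.

```python
import numpy as np

# Illustrative parameters only (not the dissertation's actual settings).
Z_NEAR, Z_FAR = 0.3, 10.0      # depth range in metres
FB = 100.0                     # focal length x baseline, in pixel*metre
PRECISION = 1.0                # renderer rounds disparity to integer pels

def disparity(d):
    """Disparity implied by an 8-bit depth label d under inverse-depth
    quantization, before the renderer rounds it."""
    inv_z = (d / 255.0) * (1.0 / Z_NEAR - 1.0 / Z_FAR) + 1.0 / Z_FAR
    return FB * inv_z

def dnose_range(d):
    """All labels whose rounded disparity equals that of d; any label in this
    range synthesizes exactly the same view (the no-synthesis-error idea)."""
    labels = np.arange(256)
    rounded = np.round(disparity(labels) / PRECISION)
    same = labels[rounded == np.round(disparity(d) / PRECISION)]
    return int(same.min()), int(same.max())

def embed_bit(d, bit):
    """Pick the label nearest to d, inside its D-NOSE range, whose parity
    encodes `bit`; return (new_label, embedded?)."""
    lo, hi = dnose_range(d)
    candidates = [v for v in range(lo, hi + 1) if v % 2 == bit]
    if not candidates:                      # interval too narrow for this bit
        return int(d), False
    return min(candidates, key=lambda v: abs(v - int(d))), True

# Extraction then only needs the parity of the watermarked label:
# bit = watermarked_label % 2
```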
Depth cameras are now in widespread use because of their low price. However, the quality of the depth maps they capture is degraded by various defects, especially near object boundaries. Hence, before Kinect depth data are used, these invalid regions should be filled. We propose an efficient hole-filling strategy based on a new depth-assisted edge detection algorithm. Only pixels on the same side of an edge are selected as reference pixels to predict the missing pixels with a plane-fitting method. In this way, neighboring pixels on the opposite side are never used as reference pixels, so the blurring effect near edges is minimized. Experimental results demonstrate that the proposed hole-filling strategy generates accurate depth maps and outperforms existing methods in depth-map accuracy.
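The edge-aware filling step can likewise be sketched: fit a plane z = a*x + b*y + c by least squares to the valid depth samples that lie on the same side of the detected edge as the hole, then evaluate the plane at the missing pixels. The mask names and helper below are hypothetical stand-ins for the dissertation's actual pipeline.

```python
import numpy as np

def fill_hole_by_plane(depth, hole_mask, same_side_mask):
    """Hypothetical sketch: fill the pixels in `hole_mask` by least-squares
    plane fitting over valid pixels restricted to `same_side_mask`, i.e. the
    neighbours judged to lie on the same side of the object edge as the hole.
    Pixels across the edge never enter the fit, which limits edge blurring."""
    ref_y, ref_x = np.nonzero(same_side_mask & ~hole_mask)
    # Solve depth = a*x + b*y + c in the least-squares sense.
    A = np.column_stack([ref_x, ref_y, np.ones_like(ref_x)])
    (a, b, c), *_ = np.linalg.lstsq(A, depth[ref_y, ref_x].astype(float),
                                    rcond=None)
    filled = depth.astype(float)
    hole_y, hole_x = np.nonzero(hole_mask)
    filled[hole_y, hole_x] = a * hole_x + b * hole_y + c
    return filled

# Example on a synthetic 5x5 tile: a slanted surface with two missing pixels.
depth = np.add.outer(np.arange(5) * 2.0, np.arange(5) * 3.0) + 10.0
hole = np.zeros((5, 5), bool); hole[2, 2] = hole[2, 3] = True
print(fill_hole_by_plane(depth, hole, np.ones((5, 5), bool)))
```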
Furthermore, an auditory depth image system is introduced that offers visually impaired users a different way to perceive images. | en
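For the auditory depth image idea, one common vOICe-style convention (used here purely as an assumed illustration, not the system the dissertation implements) is to sweep the image column by column, mapping row position to pitch and nearness to loudness:

```python
import numpy as np

def depth_to_sweep(depth, fs=16000, col_ms=20, f_lo=200.0, f_hi=4000.0):
    """Assumed vOICe-like mapping: scan columns left to right; each row adds a
    sine tone whose pitch rises toward the top of the image and whose loudness
    grows as the pixel gets closer (smaller depth). Returns a mono waveform."""
    depth = depth.astype(float)
    nearness = 1.0 - depth / (depth.max() + 1e-9)   # 1 = nearest, 0 = farthest
    rows, cols = depth.shape
    freqs = np.linspace(f_hi, f_lo, rows)           # top row -> highest pitch
    t = np.arange(int(fs * col_ms / 1000)) / fs
    tones = np.sin(2 * np.pi * freqs[:, None] * t[None, :])  # (rows, samples)
    chunks = [(nearness[:, c:c + 1] * tones).mean(axis=0) for c in range(cols)]
    audio = np.concatenate(chunks)
    return audio / (np.abs(audio).max() + 1e-9)     # normalize to [-1, 1]

# A toy 8x8 depth map: a near object on the left against a far background.
demo = np.full((8, 8), 4.0); demo[2:6, 1:3] = 1.0
waveform = depth_to_sweep(demo)            # about 0.16 s of audio at 16 kHz
```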
dc.description.provenance | Made available in DSpace on 2021-06-15T12:41:35Z (GMT). No. of bitstreams: 1; ntu-105-D98942026-1.pdf: 11393107 bytes, checksum: bb928f3b0c622cae648ec2cd5cfe2c77 (MD5). Previous issue date: 2016 | en
dc.description.tableofcontents | Oral Examination Committee Certification
Acknowledgements
Abstract (Chinese)
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Organization of the Dissertation
Chapter 2 Related Works
2.1 3D Unseen Visible Watermarking (UVW)
2.2 Depth Hole Filling
2.3 Visual Imagery
Chapter 3 3D UVW using Depth No-Synthesis-Error Model (D-NOSE)
3.1 Motivation for 3D UVW using D-NOSE Model
3.2 Background of View Synthesis in 3DV
3.2.1 Quantization of Depth Information
3.2.2 Depth-Image-Based Rendering (DIBR)
3.2.3 View Merging and Hole Filling
3.3 Depth No-Synthesis-Error (D-NOSE) Model
3.3.1 Rounding of Disparity in Forward Warping
3.3.2 D-NOSE Model for Error-Free View Synthesis
3.4 Proposed 3D UVW Scheme for View Synthesis
3.4.1 Watermark Embedding using D-NOSE Model
3.4.2 Watermark Extraction
3.5 Implementations and Experimental Results
3.5.1 Suitable Region for Watermark Embedding
3.5.2 View Synthesis Imperceptibility Under Normal Viewing Conditions
3.5.3 Robustness to JPEG Compression Attack
3.5.4 Effects of Image Resizing by Bilinear Interpolation
3.5.5 Comparison with Existing Methods
3.6 Extension of the Proposed 3D UVW
3.6.1 Application for 3D Auxiliary Information Delivery
3.6.2 Security Protection by Visual Cryptography
3.7 Conclusions
Chapter 4 3D UVW using GD-NOSE for Multi-view Synthesis
4.1 3DTV Concept Based on Multi-view Video plus Depth (MVD)
4.2 Depth-Based Intermediate View Synthesis
4.3 D-NOSE Model for Multi-view Synthesis
4.4 Proposed 3D UVW Scheme for Multi-view Synthesis
4.4.1 Watermark Embedding using Generic D-NOSE Model
4.4.2 Watermark Extraction
4.5 Implementations and Experimental Results
4.5.1 Suitable Region for Watermark Embedding
4.5.2 View Synthesis Imperceptibility and Final Results
4.5.3 Effects of JPEG Compression Attack
4.5.4 Effects of Image Resizing by Bilinear Interpolation
4.6 Conclusions
Chapter 5 Improved Edge Detection Algorithm using RGB-D Data
5.1 Motivation for Depth-Assisted Edge Detection
5.2 Depth-Assisted Edge Detection
5.2.1 Color and Depth Edge Extraction
5.2.2 Depth-Edge Pixel Classification
5.2.3 Noisy Depth-Edge Pixel Removal
5.2.4 Edge Fusion
5.3 Experimental Results
5.3.1 Experiments on Kinect-like Degradation Dataset
5.3.2 Experiments on Real-World Kinect Data
5.4 Conclusions
Chapter 6 Depth Hole Filling using Depth-Assisted Edge Detection
6.1 Motivation of Depth-Assisted Edge Detection for Depth Hole Filling
6.2 Depth Hole Filling using Depth-Assisted Edge Detection
6.2.1 Depth Hole Classification
6.2.2 Random Hole Filling
6.2.3 Shadow Hole Filling using Edge-Based Strategy
6.3 Experimental Results
6.3.1 Experiments on Kinect-like Degradation Dataset
6.3.2 Experiments on Real-World Kinect Depth Maps
6.4 Conclusions
Chapter 7 Auditory Depth Images for Visually Impaired Users
7.1 Motivation for Visual Imagery
7.2 Proposed Census-Based Method for Depth Map
7.2.1 Stereo Vision and Camera Calibration
7.2.2 Census Transform-Based Stereo Matching Algorithm
7.2.3 Depth Estimation
7.3 Image-to-Sound Mapping
7.3.1 Training System: Depth Image-to-Sound Mapping
7.3.2 Non-training System: High-Level Semantics
7.4 Experimental Results
7.4.1 Matching Quality
7.4.2 Depth Image-to-Sound Mapping
7.4.3 High-Level Semantics
7.4.4 Preliminary User Testing
7.5 Conclusions
Chapter 8 Conclusions and Future Work
References
dc.language.iso | en
dc.subject | 影像生成 (view synthesis) | zh_TW
dc.subject | 深度圖 (depth map) | zh_TW
dc.subject | 深度圖像修補 (depth hole filling) | zh_TW
dc.subject | 深度合成無錯模組 (depth no-synthesis-error model) | zh_TW
dc.subject | 不可視顯性浮水印 (unseen visible watermarking) | zh_TW
dc.subject | Depth map | en
dc.subject | Depth hole filling | en
dc.subject | Depth No-Synthesis-Error (D-NOSE) | en
dc.subject | Unseen Visible Watermarking (UVW) | en
dc.subject | View synthesis | en
dc.title | 基於深度圖之影像處理及其應用 | zh_TW
dc.title | Depth-Based Image Processing and Its Applications | en
dc.type | Thesis
dc.date.schoolyear | 104-2
dc.description.degree | 博士 (Doctor)
dc.contributor.oralexamcommittee | 吳家麟, 丁建均, 鍾國亮, 黃文良
dc.subject.keyword | 深度圖, 影像生成, 不可視顯性浮水印, 深度合成無錯模組, 深度圖像修補 | zh_TW
dc.subject.keyword | Depth map, View synthesis, Unseen Visible Watermarking (UVW), Depth No-Synthesis-Error (D-NOSE), Depth hole filling | en
dc.relation.page | 156
dc.identifier.doi | 10.6342/NTU201601412
dc.rights.note | 有償授權 (authorized for a fee)
dc.date.accepted | 2016-07-27
dc.contributor.author-college | 電機資訊學院 (College of Electrical Engineering and Computer Science) | zh_TW
dc.contributor.author-dept | 電信工程學研究所 (Graduate Institute of Communication Engineering) | zh_TW
Appears in Collections: 電信工程學研究所 (Graduate Institute of Communication Engineering)

Files in This Item:
File | Size | Format
ntu-105-1.pdf (not authorized for public access) | 11.13 MB | Adobe PDF