NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/24987

Full metadata record
dc.contributor.advisor: 陳良基 (Liang-Gee Chen)
dc.contributor.author: Yu-Lin Chang (en)
dc.contributor.author: 張毓麟 (zh_TW)
dc.date.accessioned: 2021-06-08T05:59:40Z
dc.date.copyright: 2007-08-02
dc.date.issued: 2007
dc.date.submitted: 2007-07-30
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/24987
dc.description.abstract: The human pursuit of visual quality never stops. The progression from monochrome television through color television to today's digital television both challenges the limits of human vision and shows technology striving for perfection. Now that LCD and plasma televisions are flourishing, how legacy interlaced TV signals are restored to progressive scanning for output on such displays greatly affects users in both picture quality and speed. Beyond LCD televisions, major manufacturers have all developed LCD panels capable of presenting stereoscopic 3D, which foretells that the next generation of display interfaces will center on stereoscopic vision. Under this trend, the provision and compression of stereo 3D video content will become a necessity for the television of the new era. Traditional methods of acquiring stereo images all rely on extra equipment to supply the data that turns the third dimension into depth. Yet a large body of video content was long ago shot as 2D for flat-panel viewing, and such content will keep appearing in large quantities; if it cannot exploit a stereoscopic display, that is an invisible waste. Looking at how human vision works, a person can construct a stereoscopic impression of a scene even without both eyes or multi-point focusing of the lens. Human depth mechanisms, spanning binocular vision, image understanding, and image cognition, all help us "see" the depth of objects, which is why IMAX 3D films rely on experts to convert 2D footage into 3D. If these principles can be understood and transplanted into electronic products, less effort is needed at the stereoscopic capture end, and all 2D content can be converted into 3D directly inside the device.
In pursuit of the best picture quality, this dissertation addresses signal processing for reconstructing and restoring video signals across these generational transitions, when signals of the old generation are converted to the specifications of the new. For 2D video, this dissertation proposes an adaptive motion compensated de-interlacing method that converts interlaced video to progressive video, breaking the limits of traditional motion adaptive de-interlacing and greatly raising the resolution of the reconstructed frames. For 3D video, this dissertation not only gives a comprehensive survey of how 3D video is captured but also proposes a real-time 2D-to-3D conversion system. The proposed system materializes the concepts of human stereoscopic vision, including binocular vision, image understanding, and image cognition, into algorithms and hardware designs, so that home multimedia platforms can embed the system and convert 2D video into 3D video directly. 3D content generation then no longer has to start at the capture end, and the huge body of existing 2D video gains a new level of capability.
The system comprises several parts: three depth reconstruction tools, depth map fusion, and adaptive depth image interpolation. The three depth reconstruction tools turn different human depth cues into algorithms; depth map fusion governs the reconstruction tools and fuses their outputs across different scenes and contents; the depth image interpolator then performs depth image based rendering, drawing the left-eye and right-eye images for output to a stereoscopic display. These conversion tools all demand a huge amount of pixel-based computation, making hardware acceleration necessary. The final system is a real-time 2D-to-3D converter that home multimedia platform vendors can embed in their systems, adding the ability to convert TV series, sports programs, and movies into 3D and thereby letting stereoscopic displays realize their full value. (zh_TW)
dc.description.abstract: Humans keep pursuing more realistic vision devices. Video displays have evolved from monochrome television to today's 3D-LCD, and the video signals vary across all of these devices. In this dissertation, video signal conversion for 2D and 3D video is discussed in two parts: de-interlacing and 2D-to-3D conversion. The de-interlacing methods recover data lost in the temporal and spatial domains of a 2D video sequence. The 2D-to-3D conversion produces the missing third dimension as the depth map of a 2D video, then converts the depth map and the 2D video into 3D video.
The transition between interlaced and progressive TV signals has hindered the quality improvement of new display panels, and post-processing such as de-interlacing has become a key performance index for a TV decoder. In Part I, three kinds of de-interlacing methods are described first: intra-field de-interlacing, motion adaptive de-interlacing, and motion compensated de-interlacing.
Second, for better de-interlaced image quality, we propose an intra-field de-interlacing algorithm named "Extended Intelligent Edge-based Line Average" (EIELA), together with its VLSI module implementation. Third, for near-perfect de-interlaced image quality, a de-interlacing algorithm using adaptive global and local motion estimation/compensation is proposed. It consists of global and local motion estimation/compensation, 4-field motion adaptation, block-based directional edge interpolation, and a GMC/MC/MA block mode decision module. Defects such as jagging, blurring, line-crawling, and feathering are suppressed more strongly than in traditional methods. Moreover, true motion information is extracted accurately by the 4-field motion estimation and the global motion information.
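For orientation, below is a minimal sketch of the classical edge-based line average (ELA) baseline that EIELA extends: each missing pixel is interpolated along whichever of three candidate directions shows the smallest luminance difference. The three-direction candidate set and the plain absolute-difference test are illustrative assumptions; the dissertation's extended algorithm uses more elaborate rules.

```python
import numpy as np

def ela_deinterlace(field: np.ndarray) -> np.ndarray:
    """Classical edge-based line average (ELA) intra-field de-interlacing.

    `field` holds the H/2 x W lines of one parity; the output doubles the
    height, interpolating each missing line along the candidate direction
    (45 deg, vertical, 135 deg) with the smallest luminance difference.
    """
    h, w = field.shape
    out = np.zeros((2 * h, w), dtype=field.dtype)
    out[0::2] = field                        # keep the lines we have
    up = field[:-1].astype(np.int32)         # line above each missing line
    dn = field[1:].astype(np.int32)          # line below each missing line
    for x in range(w):
        xl, xr = max(x - 1, 0), min(x + 1, w - 1)
        cands = [(up[:, xl], dn[:, xr]),     # 45-degree edge
                 (up[:, x],  dn[:, x]),      # vertical
                 (up[:, xr], dn[:, xl])]     # 135-degree edge
        diffs = np.stack([np.abs(a - b) for a, b in cands])
        avgs = np.stack([(a + b) // 2 for a, b in cands])
        best = np.argmin(diffs, axis=0)      # best direction per pixel
        out[1:-1:2, x] = np.take_along_axis(avgs, best[None], 0)[0]
    out[-1] = out[-2]                        # replicate the final line
    return out
```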
In Part II, we first make a detailed survey of 3D video capturing methods, which fall into three kinds: active sensor based methods, passive sensor based methods, and 2D-to-3D conversion. After analyzing the previous works, a real-time automatic depth fusion 2D-to-3D conversion system is proposed for the home multimedia platform.
In Part III, we convert the binocular, monocular, and pictorial depth cues into depth reconstruction algorithms. Five novel algorithms and a hardware architecture are presented. The depth reconstruction algorithms fall into three categories: motion parallax based depth reconstruction, which utilizes the binocular depth cue; image based depth reconstruction, which uses the monocular depth cue; and consciousness based depth reconstruction, which maps the perspective of the pictorial depth cue to a depth gradient. After depth reconstruction, a priority depth fusion algorithm is proposed to integrate all the depth maps. Then a multiview depth image based rendering method is presented to render multiview images for the multiview 3D-LCD.
One-dimensional cross search dense disparity estimation is proposed for the motion parallax based depth reconstruction. This fast algorithm utilizes the characteristics of motion parallax and trinocular cameras. Because the motion parallax is induced by camera motion, the 1D cross search tends to find better and smoother results for a true depth map. A symmetric trinocular property for trinocular stereo matching is also described. A 2D full search dense disparity estimation hardware architecture is then designed for real-time operation of the motion parallax based depth reconstruction. Dense disparity estimation must calculate a disparity vector for every depth pixel. To support resolution switching at a high specification, the proposed hardware architecture uses a data assignment unit as a small buffer to achieve an IP-based design; the hardware can switch among three depth pixel resolutions in real time.
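As a reference point for the disparity estimation above, here is a minimal block-matching sketch of dense disparity estimation on rectified stereo; the SAD cost, window size, and exhaustive 1D search range are generic illustrative choices, not the dissertation's 1D cross search pattern or its hardware dataflow.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def dense_disparity(left, right, max_disp=32, win=7):
    """Per-pixel disparity via 1D SAD block matching on rectified stereo.

    For each candidate shift d, the right image is displaced d pixels,
    the absolute difference is box-filtered over a win x win window,
    and each pixel keeps the d with the lowest aggregated cost.
    """
    left = left.astype(np.float32)
    right = right.astype(np.float32)
    h, w = left.shape
    best_cost = np.full((h, w), np.inf, dtype=np.float32)
    disp = np.zeros((h, w), dtype=np.int32)
    for d in range(max_disp + 1):
        shifted = np.empty_like(right)
        shifted[:, d:] = right[:, : w - d]   # shift right image by d
        shifted[:, :d] = right[:, :1]        # replicate the left border
        cost = uniform_filter(np.abs(left - shifted), size=win)
        better = cost < best_cost            # keep the cheapest shift
        disp[better] = d
        best_cost[better] = cost[better]
    return disp
```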
Depth from Focus (DfF) and short-term motion assisted color segmentation are proposed for the image based depth reconstruction. The DfF method exploits the "blurriness" characteristic of pictures taken with a large-aperture camera: after separating the object from the blurred areas, the depth of the object is set to the focus distance of the picture. The second image-based method is depth map generation by short-term motion assisted color segmentation, which achieves smooth depth maps in both the spatial and temporal domains. Both methods, however, still face the moving-camera problem and the tuning required for different types of image sequences; they should be combined with depth from geometry perspective and other depth cues to produce more accurate depth maps.
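To make the DfF idea concrete, here is a minimal sketch of a block-wise focus measure and the resulting depth assignment, assuming a single known focus distance; the Laplacian-energy measure, block size, and threshold are illustrative stand-ins for the object extraction steps described above.

```python
import numpy as np
from scipy.ndimage import laplace

def focus_map(gray: np.ndarray, block: int = 16) -> np.ndarray:
    """Block-wise focus measure: local energy of the Laplacian.

    Sharp (in-focus) regions have strong second derivatives, so high
    Laplacian energy marks the focused object; low energy marks blur.
    """
    g = gray.astype(np.float32)
    energy = laplace(g) ** 2
    h, w = g.shape
    hb, wb = h // block, w // block
    # average the energy inside each block
    blocks = energy[: hb * block, : wb * block].reshape(hb, block, wb, block)
    return blocks.mean(axis=(1, 3))

def dff_depth(gray, focus_distance, threshold):
    """Assign the known focus distance to focused blocks, 0 elsewhere
    (a crude stand-in for the object extraction steps described above)."""
    fm = focus_map(gray)
    return np.where(fm > threshold, focus_distance, 0.0)
```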
For the consciousness based depth reconstruction, we present a robust detection algorithm based on structural component analysis, suitable for images with distinct object edges. The proposed vanishing line and vanishing point detection analyzes the image structure directly, without complicated mathematics. It works on an image sequence without prior temporal information and detects the dominant vanishing lines with high probability and accuracy. Because the block-based algorithm keeps a regular block data flow, it is faster, simpler, and more efficient. Within the 2D-to-3D conversion procedure, the detected vanishing lines and point contribute greatly to the overall scene knowledge, so the conversion proceeds more easily.
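Once dominant vanishing lines are available, the vanishing point itself reduces to a line-intersection estimate. The sketch below shows that final step as a generic least-squares intersection of lines given as point/direction pairs; it is not the block-based detector itself.

```python
import numpy as np

def vanishing_point(lines):
    """Least-squares intersection of 2D lines given as (point, direction).

    Each line contributes the constraint n . (v - p) = 0, where n is the
    unit normal of its direction; stacking constraints gives A v = b.
    """
    A, b = [], []
    for p, d in lines:
        d = np.asarray(d, dtype=float)
        d /= np.linalg.norm(d)
        n = np.array([-d[1], d[0]])     # unit normal to the line
        A.append(n)
        b.append(n @ np.asarray(p, dtype=float))
    v, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return v                            # (x, y) of the vanishing point

# e.g., two road edges converging toward the horizon:
vp = vanishing_point([((0, 480), (1, -1)), ((640, 480), (-1, -1))])
# both toy lines pass through (320, 160)
```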
After retrieving the depth maps from the different depth cues, we propose a priority depth fusion method to integrate the three depth maps. It weighs the priority of each depth map in six aspects: scene adaptability, temporal consistency, perceptibility, correctness, fineness, and coverage area; all six must be considered together to obtain a comfortable depth map.
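A minimal sketch of priority-based fusion under a strong simplification: each depth map carries one scalar priority and a validity mask, and every pixel takes its value from the highest-priority map that is valid there. The six-aspect priority decision above is abstracted into that single score.

```python
import numpy as np

def priority_depth_fusion(depth_maps, masks, priorities):
    """Fuse depth maps pixel-wise by priority.

    depth_maps: list of HxW float arrays from different depth cues.
    masks:      list of HxW bool arrays marking where each map is valid.
    priorities: one scalar per map (higher wins); a stand-in for the
                six-aspect priority decision described above.
    """
    order = np.argsort(priorities)        # low to high priority
    fused = np.zeros_like(depth_maps[0])
    for i in order:                       # higher priority overwrites
        fused = np.where(masks[i], depth_maps[i], fused)
    return fused
```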
We also propose a per-pixel texture mapping depth image based rendering (DIBR) algorithm that can be accelerated by the GPU. The algorithm converts pixels to vertices and constructs an image plane representing the original frame and its depth map; through the GPU pipeline, the left and right images are rendered out. Even for a free viewpoint application, as long as the GPU draws more than 49.6M triangles per second, the multiview DIBR can still run in real time, and the proposed DIBR is 38 times faster than the previous design. With these algorithms, an automatic depth fusing 2D-to-3D conversion system is described. The proposed system generates depth maps for most commercial video with hardware acceleration; using GPU computation, the depth map and the original 2D image are converted into stereo images for 3D display devices. A huge amount of 2D content such as DVDs or TV programs can thus be converted and shown on 3D displays at the end-user side.
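For contrast with the GPU method, here is a minimal sketch of the simpler image-shifting DIBR baseline surveyed in Chapter 10: each pixel is displaced horizontally in proportion to its depth, painting far-to-near so near pixels occlude far ones, with naive hole filling. The depth-to-disparity scaling and the hole filling are illustrative assumptions.

```python
import numpy as np

def image_shift_dibr(frame, depth, max_disp=16, eye=+1):
    """Image-shifting DIBR: synthesize one view from frame + depth map.

    Nearer pixels (larger depth value, assumed in [0, 1]) get larger
    horizontal parallax; eye = +1/-1 selects the right/left view.
    Painting far-to-near lets near pixels occlude far ones correctly.
    """
    h, w = depth.shape
    view = np.zeros_like(frame)
    filled = np.zeros((h, w), dtype=bool)
    disp = (eye * max_disp * depth).astype(np.int32)
    order = np.argsort(depth, axis=1)      # far to near, per row
    for y in range(h):
        for x in order[y]:
            nx = x + disp[y, x]
            if 0 <= nx < w:
                view[y, nx] = frame[y, x]
                filled[y, nx] = True
        # naive hole filling: propagate the last filled pixel rightward
        last = view[y, 0]
        for x in range(w):
            if filled[y, x]:
                last = view[y, x]
            else:
                view[y, x] = last
    return view
```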
In summary, for the de-interlacing of 2D video this dissertation presents an intra-field de-interlacing hardware architecture named Extended Intelligent Edge-based Line Average, and an adaptive local/global motion compensated de-interlacing method. For converting 2D video signals to 3D video signals, an automatic depth fusing 2D-to-3D conversion system is proposed that utilizes human depth cues. Five algorithms using different depth cues are proposed within this system: one-dimensional cross search dense disparity estimation, depth map generation with short-term motion assisted color segmentation, block-based vanishing line/point detection, per-pixel multiview depth image based rendering, and priority depth fusion. A hardware architecture for 2D full search dense disparity estimation is also implemented and combined with the whole system. The proposed 2D-to-3D conversion not only produces acceptable depth maps for 2D video but also renders multiview video from the depth information and the 2D video. (en)
dc.description.provenance: Made available in DSpace on 2021-06-08T05:59:40Z (GMT). No. of bitstreams: 1; ntu-96-F91943052-1.pdf: 35716432 bytes, checksum: 6c6076c5e2081db683ceb161c9ce0225 (MD5). Previous issue date: 2007 (en)
dc.description.tableofcontents:
Contents
List of Figures
List of Tables
1 Introduction
1.1 Trends in TV Standard
1.1.1 NTSC
1.1.2 HDTV
1.1.3 3D-TV: ATTEST
1.2 Trends in Video Display Device
1.2.1 3D Video Display
1.3 Dissertation Organization
Part I: 2D Video Signal Conversion
2 Survey of TV Signal De-interlacing
2.1 Aliasing Effect in the Sampling of TV
2.2 Conventional De-interlacing Algorithms
2.2.1 Intra-field De-interlacing
2.2.2 Motion Adaptive De-interlacing
2.2.3 Motion Compensated De-interlacing
2.2.4 GMC De-interlacing
2.2.5 Prior Arts
3 The Extended Intelligent Edge-based Line Average Algorithm and Architecture
3.1 Extended Intelligent Edge-based Line Average Algorithm
3.2 Extended Intelligent Edge-based Line Average - EIELA
3.3 EIELA VLSI Architecture
3.4 Simulation Results
3.4.1 Hardware Simulation
3.4.2 PSNR Comparison and New Test Sequence
3.5 Subjective View
3.6 Summary
4 Adaptive 4-Field Global/Local Motion Compensated De-interlacing
4.1 Proposed Adaptive 4-Field Global/Local Motion Compensated De-interlacing
4.1.1 Overall Flow
4.1.2 Global Motion Estimation/Compensation
4.1.3 Same Parity 4-Field Local Motion Estimation
4.2 Experimental Results and Performance Analysis
4.2.1 Analysis of Instruction Counts and Hardware Cost
4.2.2 Objective Performance Comparison
4.2.3 Comparisons of Subjective View
4.3 Summary
Part II: Survey of 3D Video Capturing
5 Survey of 3D Capturing Apparatus
5.1 The Principle of Stereo Vision
5.2 The History of Stereo Vision
5.3 Depth Map
5.4 3D Video Capturing
5.4.1 Active Sensor based Method
5.4.2 Passive Sensor based Method
5.4.3 Signal Processing based Method
6 Review of 2D-to-3D Conversion
6.1 Depth Cues in Human Vision
6.2 Overall Flow
6.3 Depth Image based Rendering
6.4 Depth from Focus
6.5 Image-based Methods
6.5.1 Still Image Analysis Method
6.5.2 Geometry Perspective Method
6.6 Motion-based Methods
6.6.1 Depth from Motion
6.6.2 Structure from Motion
6.7 Depth Fusion Method
6.8 The Comparison Between the 3D Video Capturing Methods
6.9 Summary
Part III: 2D-to-3D Conversion
7 Motion Parallax based Depth Reconstruction
7.1 Prior Art
7.2 Disparity Estimation
7.3 One-Dimensional Fast Cross Search Dense Disparity Estimation Algorithm
7.3.1 Problem Definition
7.3.2 Proposed One-Dimensional Fast Cross Search Dense Disparity Estimation
7.4 Two-Dimensional Full Search Dense Disparity Estimation Hardware Architecture Design
7.4.1 Challenges
7.4.2 Dense Disparity Estimation Architecture
7.5 Experimental Results and Architecture Performance
7.5.1 One-Dimensional Fast Cross Search Dense Disparity Estimation Experimental Results
7.5.2 Performance of the Two-Dimensional Full Search Dense Disparity Estimation Hardware Architecture Design
7.6 Summary
8 Image-based Depth Reconstruction
8.1 Proposed Object-based Depth from Focus Algorithm
8.1.1 Focus Measure and Depth Estimation
8.1.2 Step Function
8.1.3 Depth Interpolation
8.1.4 Erosion
8.1.5 Re-interpolation
8.1.6 Clean Background
8.1.7 Efficient Depth Interpolation with Color Segmentation and Watershed Algorithm
8.2 Proposed Depth Map Generation for 2D-to-3D Conversion by Short-Term Motion Assisted Color Segmentation
8.2.1 Proposed Method
8.3 Experimental Results
8.3.1 Object-based Depth From Focus
8.3.2 Short-Term Motion Assisted Color Segmentation Experimental Results
8.4 Summary
9 Consciousness-based Depth Reconstruction
9.1 Depth from Geometry
9.2 Proposed Block-based Vanishing Line and Point Detection Algorithm
9.2.1 Edge Detection
9.2.2 Energy Filter
9.2.3 Block Object Finding
9.2.4 Object Combination and Selection
9.2.5 Dominant Vanishing Lines Acquisition
9.2.6 Vanishing Point Detection
9.3 Experimental Results
9.4 Summary
10 Depth Image Based Rendering
10.1 Prior Arts
10.1.1 Image Shifting DIBR
10.2 Proposed Texture Mapping Multi-view DIBR
10.3 Experimental Results
10.4 Summary
11 Automatic Depth Fusing 2D-to-3D Conversion System
11.1 Challenge
11.2 Proposed Overall Flow
11.3 Motion Parallax based Depth Reconstruction Module
11.4 Image based Depth Reconstruction Module
11.5 Consciousness based Depth Reconstruction Module
11.6 Priority Depth Fusion Module
11.7 Multiview DIBR Module
11.8 System Hardware/Software Co-Design
11.9 Design Methodology and Flow
11.10 Experimental Results
11.11 Summary
12 Conclusions
12.1 Principal Contributions
12.1.1 Adaptive 4-Field Global/Local Motion Compensated De-interlacing
12.1.2 Automatic Depth Fusing 2D-to-3D Conversion System
12.2 Future Directions
12.2.1 Multi-Sensor 3D Video Capturing
12.2.2 Post-processing for 3D-LCD Devices
12.2.3 Intelligent Vehicles and Robots
Bibliography
Curriculum Vitae
Publication List
dc.language.iso: en
dc.subject: 二維至三維影像轉換 (2D-to-3D conversion) (zh_TW)
dc.subject: 去交錯 (de-interlacing) (zh_TW)
dc.subject: 電視後處理 (TV post-processing) (zh_TW)
dc.subject: 深度圖生成 (depth map generation) (zh_TW)
dc.subject: 三維影像 (3D video) (zh_TW)
dc.subject: De-interlacing (en)
dc.subject: 3D video (en)
dc.subject: depth map generation (en)
dc.subject: 2D-to-3D conversion (en)
dc.subject: TV Post-processing (en)
dc.title: 二維及三維影像轉換之演算法及架構分析 (Algorithm and Architecture Analysis of 2D and 3D Video Conversion) (zh_TW)
dc.title: Algorithm and Architecture Analysis of the Video Signal Conversion for 2D and 3D Video (en)
dc.type: Thesis
dc.date.schoolyear: 95-2
dc.description.degree: 博士 (Doctor of Philosophy)
dc.contributor.oralexamcommittee: 陳宏銘, 簡韶逸 (Shao-Yi Chien), 陳永昌, 楊家輝, 杭學鳴, 賴永康, 陳美娟
dc.subject.keyword: 去交錯, 電視後處理, 二維至三維影像轉換, 深度圖生成, 三維影像 (de-interlacing, TV post-processing, 2D-to-3D conversion, depth map generation, 3D video) (zh_TW)
dc.subject.keyword: De-interlacing, TV Post-processing, 2D-to-3D conversion, depth map generation, 3D video (en)
dc.relation.page: 228
dc.rights.note: 未授權 (not authorized for public access)
dc.date.accepted: 2007-07-31
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science) (zh_TW)
dc.contributor.author-dept: 電子工程學研究所 (Graduate Institute of Electronics Engineering) (zh_TW)
Appears in Collections: 電子工程學研究所 (Graduate Institute of Electronics Engineering)

Files in This Item:
File: ntu-96-1.pdf | Size: 34.88 MB | Format: Adobe PDF | Access: 未授權公開取用 (not authorized for public access)

