Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/65964
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 簡韶逸(Shao-Yi Chien) | |
dc.contributor.author | Chieh-Chi Kao | en |
dc.contributor.author | 高介其 | zh_TW |
dc.date.accessioned | 2021-06-17T00:16:52Z | - |
dc.date.available | 2017-07-06 | |
dc.date.copyright | 2012-07-06 | |
dc.date.issued | 2012 | |
dc.date.submitted | 2012-07-02 | |
dc.identifier.citation | [1] L. Grady, "Random walks for image segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 11, pp. 1768–1783, 2006.
[2] Y. Boykov and M.-P. Jolly, "Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images," in Proceedings of IEEE International Conference on Computer Vision (ICCV 2001), vol. 1, pp. 105–112.
[3] C. Rother, V. Kolmogorov, and A. Blake, "GrabCut: interactive foreground extraction using iterated graph cuts," ACM Transactions on Graphics, vol. 23, pp. 309–314, 2004.
[4] Y.-Y. Chuang, B. Curless, D. H. Salesin, and R. Szeliski, "A Bayesian approach to digital matting," in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2001), vol. 2, pp. 264–271.
[5] A. Blake, C. Rother, M. Brown, P. Perez, and P. Torr, "Interactive image segmentation using an adaptive GMMRF model," in Proceedings of European Conference on Computer Vision (ECCV 2004), pp. 428–441.
[6] T. H. Kim, K. M. Lee, and S. U. Lee, "Nonparametric higher-order learning for interactive segmentation," in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2010), pp. 3201–3208.
[7] Yu Fu, Jian Cheng, Zhenglong Li, and Hanqing Lu, "Saliency cuts: An automatic approach to object segmentation," in Proceedings of International Conference on Pattern Recognition (ICPR 2008), pp. 1–4.
[8] Xiaodi Hou and Liqing Zhang, "Saliency detection: A spectral residual approach," in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2007), pp. 1–8.
[9] Jue Wang, Pravin Bhat, R. Alex Colburn, Maneesh Agrawala, and Michael F. Cohen, "Interactive video cutout," ACM Transactions on Graphics, vol. 24, pp. 585–594, 2005.
[10] Y. Boykov and V. Kolmogorov, "An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 9, pp. 1124–1137, 2004.
[11] Xue Bai and Guillermo Sapiro, "A geodesic framework for fast interactive image and video segmentation and matting," in Proceedings of IEEE International Conference on Computer Vision (ICCV 2007), pp. 1–8.
[12] Tomoyuki Nagahashi, Hironobu Fujiyoshi, and Takeo Kanade, "Video segmentation using iterated graph cuts based on spatio-temporal volumes," in Proceedings of Asian Conference on Computer Vision (ACCV 2009), pp. 655–666.
[13] Tomoyuki Nagahashi, Hironobu Fujiyoshi, and Takeo Kanade, "Image segmentation using iterated graph cuts based on multi-scale smoothing," in Proceedings of Asian Conference on Computer Vision (ACCV 2007), pp. 806–816.
[14] Ken Fukuchi, K. Miyazato, A. Kimura, S. Takagi, and J. Yamato, "Saliency-based video segmentation with graph cuts and sequentially updated priors," in Proceedings of IEEE International Conference on Multimedia and Expo (ICME 2009), pp. 638–641.
[15] L. Itti, C. Koch, and E. Niebur, "A model of saliency-based visual attention for rapid scene analysis," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 11, pp. 1254–1259, 1998.
[16] Tie Liu, Zejian Yuan, Jian Sun, Jingdong Wang, Nanning Zheng, Xiaoou Tang, and Heung-Yeung Shum, "Learning to detect a salient object," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 2, pp. 353–367, 2011.
[17] C. Tomasi and R. Manduchi, "Bilateral filtering for gray and color images," in Proceedings of IEEE International Conference on Computer Vision (ICCV 1998), pp. 839–846.
[18] Ce Liu, W. T. Freeman, R. Szeliski, and Sing Bing Kang, "Noise estimation from a single image," in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2006), vol. 1, pp. 901–908.
[19] Frédo Durand and Julie Dorsey, "Fast bilateral filtering for the display of high-dynamic-range images," in Proceedings of the 29th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '02), New York, NY, USA, 2002, pp. 257–266, ACM.
[20] Georg Petschnigg, Richard Szeliski, Maneesh Agrawala, Michael Cohen, Hugues Hoppe, and Kentaro Toyama, "Digital photography with flash and no-flash image pairs," in ACM SIGGRAPH 2004 Papers (SIGGRAPH '04), New York, NY, USA, 2004, pp. 664–672, ACM.
[21] Johannes Kopf, Michael F. Cohen, Dani Lischinski, and Matt Uyttendaele, "Joint bilateral upsampling," ACM Transactions on Graphics, vol. 26, no. 3, July 2007.
[22] Zeev Farbman, Raanan Fattal, Dani Lischinski, and Richard Szeliski, "Edge-preserving decompositions for multi-scale tone and detail manipulation," ACM Transactions on Graphics, vol. 27, no. 3, pp. 67:1–67:10, Aug. 2008.
[23] Kaiming He, Jian Sun, and Xiaoou Tang, "Guided image filtering," in Proceedings of European Conference on Computer Vision (ECCV 2010), vol. 6311, pp. 1–14.
[24] Yu-Cheng Tseng, Po-Hsiung Hsu, and Tian-Sheuan Chang, "A 124 Mpixels/s VLSI design for histogram-based joint bilateral filtering," IEEE Transactions on Image Processing, vol. 20, no. 11, pp. 3231–3241, Nov. 2011.
[25] J.-C. Terrillon, M. N. Shirazi, H. Fukamachi, and S. Akamatsu, "Comparative performance of different skin chrominance models and chrominance spaces for the automatic detection of human faces in color images," in Proceedings of Fourth IEEE International Conference on Automatic Face and Gesture Recognition (FG 2000), pp. 54–61.
[26] Kazuma Akamine, Ken Fukuchi, Akisato Kimura, and Shigeru Takagi, "Fully automatic extraction of salient objects from videos in near real time," The Computer Journal, 2010.
[27] P. Kohli and P. H. S. Torr, "Dynamic graph cuts for efficient inference in Markov random fields," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 12, pp. 2079–2088, 2007.
[28] D. Pang, A. Kimura, T. Takeuchi, J. Yamato, and K. Kashino, "A stochastic model of selective visual attention with a dynamic Bayesian network," in Proceedings of IEEE International Conference on Multimedia and Expo (ICME 2008), pp. 1073–1076.
[29] C. Rhemann, A. Hosni, M. Bleyer, C. Rother, and M. Gelautz, "Fast cost-volume filtering for visual correspondence and beyond," in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2011), pp. 3017–3024.
[30] Sang-Kyo Han, "An architecture for high-throughput and improved-quality stereo vision processor," Master's thesis, Department of Electrical Engineering, University of Maryland, College Park, MD, 2010. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/65964 | - |
dc.description.abstract | 隨著通訊技術的進步與多媒體技術的發展,使用者可以在任何時間、地點透過可攜式裝置來連結到網路。各式各樣的應用相繼出現,如上傳照片和影片並和朋友分享、利用圖片來做搜尋、撥打網路視訊電話等等。物體切割技術在前述應用的前處理中,扮演了一個相當重要的角色。然而,受限於行動裝置的運算能力,並非每一個先前所提出的方法都能夠直接套用到可攜式裝置上。在本篇論文中,我們提出了應用於可攜式裝置之即時影片切割技術與架構設計。
在影像和視訊處理的領域中,物體切割是一個已經發展已久的研究題目,許多先前所提出的演算法已經可以達到很好的效果。然而,大部分的方法都需要使用者提供輸入有關於切割目標的資訊來輔助運算。在本篇論文中,我們提出了一個應用於可攜式裝置之即時影片切割技術,其包含了一個非監督式顯著物體偵測及切割的技術,可以在不需要使用者提供資訊的前提下切割出目標物體。實驗結果指出,由顯著特徵圖求得的顯著顏色模型能夠取代使用者輸入資訊來自動達成物體切割。本系統同時也提供了改善機制讓使用者能夠修正切割不夠正確的部分,藉由少許的使用者輔助將可大大提升切割的品質。我們所提出的顯著顏色模型不僅能夠應用到Min-cut演算法,同時也可以延伸到其他切割演算法,如matting或非參數模型。 因為受限於有限的運算能力,要直接在可攜式裝置上做高解析度的視訊切割是相當困難的。我們利用在較低的解析度上來做即時視訊切割,再將切割出來的物體罩放大至較高的解析度上。而我們選擇了引導濾波器(guided filter)來作為放大物體罩的方法,但是要增加更多的工作量在可攜式裝置上的中央處理器是不可行的。因此,我們提出了可以嵌入可攜式裝置的引導濾波器之硬體架構,可以協助達成高解析度即時視訊物體切割。就我們所知,本論文也是第一個對於引導濾波器的特殊應用積體電路設計實作。我們應用台灣積體電路公司的90奈米製程,本設計可運行在100MHz,能夠處理每秒30幀的FULL-HD(1920x1080)視訊,總邏輯閘數為92,895,總記憶體數量為3,206B。此外,和過往濾波器的特殊應用積體電路設計實作來比較,我們所提出的架構硬體效率也是最佳的。 | zh_TW |
dc.description.abstract | With the development of communication technology and multimedia applications, users can access the Internet from their mobile devices anytime and anywhere. Various applications have emerged, such as uploading photos and videos to share with friends, searching by photo, and video calling over the Internet. For such applications, object segmentation plays an important role as a preprocessing stage. However, due to the limited computation capability of mobile devices, not every segmentation approach proposed in previous work can be adopted on mobile devices directly. In this thesis, the algorithms and architecture designs of real-time video segmentation for mobile devices are presented.
Image and video segmentation is a well-developed topic in image/video processing, and many previous works achieve high performance. However, most of them require user assistance to provide prior information about the target object. For real-time video segmentation on mobile devices, this thesis proposes an unsupervised scheme combining salient object detection and segmentation, which can segment the target object without any prior information from users. The experimental results show that the proposed salient color model, derived from salient features, provides prior information with high confidence and generates precise segmentations automatically. The system also offers an effortless way for users to refine the segmentation result, which can greatly improve segmentation quality. The proposed color model of salient objects can not only be applied with the min-cut algorithm but also extended to other segmentation algorithms, such as matting or non-parametric models.
The limited computation capability of mobile devices makes it difficult to achieve real-time segmentation for high-resolution videos. We overcome this limitation by segmenting the object at QVGA resolution and then up-scaling the segmentation result to a higher resolution. The guided image filter is adopted as the up-scaling approach; however, it is infeasible to allocate this additional workload to the CPU of the mobile platform. Therefore, a hardware architecture for the guided image filter is proposed, which can be embedded in mobile devices to achieve real-time HD video segmentation. To the best of our knowledge, this work is also the first ASIC design for the guided image filter. Implemented with the TSMC 90 nm cell library, the design operates at 100 MHz and supports Full HD (1920x1080) video at 30 fps with 92.9K gate counts and 3.2 KB of on-chip memory. Moreover, in terms of hardware efficiency, our architecture also outperforms previous bilateral filter designs. | en |
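The up-scaling step the abstract describes, refining a low-resolution segmentation mask against a high-resolution guidance image, can be sketched as a floating-point software model of the gray-scale guided filter [23]. This sketch is illustrative only: the function names (`box_mean`, `guided_filter`), the edge padding, and the parameter defaults are assumptions, not the thesis's fixed-point hardware design. The box means are computed with an integral image, the same primitive the thesis's double integral image architecture builds in hardware.

```python
import numpy as np

def box_mean(x, r):
    # Mean over a (2r+1) x (2r+1) window via an integral image
    # (cumulative sums), with edge replication at the borders.
    xp = np.pad(x, r, mode="edge")
    ii = np.pad(xp, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    n = 2 * r + 1
    h, w = x.shape
    return (ii[n:n + h, n:n + w] - ii[:h, n:n + w]
            - ii[n:n + h, :w] + ii[:h, :w]) / n ** 2

def guided_filter(I, p, radius=8, eps=1e-3):
    # Gray-scale guided filter (He et al. [23]):
    #   a_k = cov(I, p) / (var(I) + eps),  b_k = mean(p) - a_k * mean(I),
    #   q   = mean(a) * I + mean(b)
    # I: guidance image (e.g. full-resolution luma), p: filtering input
    # (e.g. the upsampled low-resolution mask), both float arrays in [0, 1].
    mean_I = box_mean(I, radius)
    mean_p = box_mean(p, radius)
    cov_Ip = box_mean(I * p, radius) - mean_I * mean_p
    var_I = box_mean(I * I, radius) - mean_I ** 2
    a = cov_Ip / (var_I + eps)
    b = mean_p - a * mean_I
    return box_mean(a, radius) * I + box_mean(b, radius)
```

In a pipeline like the one the abstract outlines, `p` would be the QVGA mask resized to the target resolution before filtering, so that the filter snaps the soft mask boundaries onto the edges of the guidance image; the hardware replaces these box means with the dedicated integral image dataflow of Chapter 5.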
dc.description.provenance | Made available in DSpace on 2021-06-17T00:16:52Z (GMT). No. of bitstreams: 1 ntu-101-R99943035-1.pdf: 8724537 bytes, checksum: 89c2db92cc44e27e452ae6ce89840718 (MD5) Previous issue date: 2012 | en |
dc.description.tableofcontents | Abstract ix
Chapter 1 Introduction 1
1.1 Image and Video Segmentation 1
1.2 Guided Image Filter 4
1.3 Motivation and Design Target 5
1.4 Thesis Organization 6
Chapter 2 Automatic Object Segmentation by Salient Color Model in Image 7
2.1 Saliency Map 7
2.2 Salient Color Model 9
2.3 Segmentation in Image 10
2.4 User Refinement 12
Chapter 3 Automatic Object Segmentation by Salient Color Model in Video 13
3.1 Salient Color Model for Video 13
3.2 Segmentation in Video 14
Chapter 4 Experimental Results of Segmentation by Salient Color Model 17
4.1 Automatic Image Segmentation 17
4.1.1 Parameter Settings 18
4.1.2 Error Rate of Segmented Results 18
4.1.3 Subjective Segmentation Result Comparison 20
4.2 Effortless Video Segmentation 21
4.2.1 Parameter Settings 21
4.2.2 Performance of Segmented Results 22
4.2.3 Stability Check 24
4.2.4 Scalability Check 26
4.2.5 Real-Time Video Segmentation 28
4.2.6 Upsampling Techniques 29
Chapter 5 Architecture Design of Guided Filter 35
5.1 Guided Filter 35
5.2 Design Challenges 37
5.3 Design of Hardware Architecture 39
5.4 Coefficient Kernel Engine 40
5.4.1 Reformation of ak Formula 40
5.4.2 Architecture of ak Kernel 41
5.4.3 Reformation of bk Formula 42
5.4.4 Architecture of bk Kernel 43
5.4.5 Fraction Analysis 43
5.5 Output Kernel Engine 44
5.6 Double Integral Image Architecture 45
5.6.1 Dataflow of Double Integral Image Architecture 45
5.6.2 Boundary Handling 48
5.6.3 Computation Cycle and Hardware Utilization 48
5.7 Implementation Result 49
Chapter 6 Conclusion 53
Bibliography 55 | |
dc.language.iso | en | |
dc.title | 應用於可攜式裝置之即時影片切割技術與架構設計 | zh_TW |
dc.title | Algorithm and Architecture Design of Real-time Video Segmentation in Mobile Devices | en |
dc.type | Thesis | |
dc.date.schoolyear | 100-2 | |
dc.description.degree | Master's | |
dc.contributor.oralexamcommittee | 黃仲陵(Chung-Lin Huang),徐宏民(Winston H. Hsu),莊永裕(Yung-Yu Chuang),洪士灝(Shih-Hao Hung) | |
dc.subject.keyword | 視訊切割,引導濾波器,硬體架構 | zh_TW |
dc.subject.keyword | video segmentation, guided filter, hardware architecture | en |
dc.relation.page | 59 | |
dc.rights.note | Paid authorization | |
dc.date.accepted | 2012-07-02 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 電子工程學研究所 | zh_TW |
Appears in Collections: | Graduate Institute of Electronics Engineering |
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-101-1.pdf Currently not authorized for public access | 8.52 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.