互動式影片物件切割與影像搜尋

Po-Nung Tseng; 曾柏儂

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/22306

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	徐宏民
dc.contributor.author	Po-Nung Tseng	en
dc.contributor.author	曾柏儂	zh_TW
dc.date.accessioned	2021-06-08T04:15:15Z	-
dc.date.copyright	2010-08-13
dc.date.issued	2010
dc.date.submitted	2010-08-09
dc.identifier.citation	[1] FFmpeg: http://www.ffmpeg.org/. [2] YouTube: http://www.youtube.com/. [3] R. Achanta and M. Kankanhalli. Compressed domain object tracking for automatic indexing of objects. In ICME, pages 61–64, 2002. [4] J.-K. Ahn and C.-S. Kim. Real-time segmentation of objects from video sequences with non-stationary backgrounds using spatio-temporal coherence. In ICIP, pages 1544–1547, 2008. [5] R. Babu, K. Ramakrishnan, and S. Srinivasan. Video object segmentation: A compressed domain approach. CirSysVideo, 14(4):462–474, April 2004. [6] X. Bai and G. Sapiro. A geodesic framework for fast interactive image and video segmentation and matting. In ICCV, pages 1–8, 2007. [7] Y. Boykov and M. Jolly. Interactive graph cuts for optimal boundary and region segmentation of objects in N-D images. In ICCV, pages I: 105–112, 2001. [8] Y. Boykov and V. Kolmogorov. An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision. IEEE Trans. Pattern Anal. Mach. Intell., 26(9):1124–1137, 2004. [9] G. Bradski. The OpenCV Library. Dr. Dobb’s Journal of Software Tools, 2000. [10] A. Criminisi, G. Cross, A. Blake, and V. Kolmogorov. Bilayer segmentation of live video. In CVPR, pages 53–60, 2006. [11] M. Han, W. Xu, and Y. Gong. Video object segmentation by motion-based sequential feature clustering. In ACM MM, pages 773–782, 2006. [12] B. K. P. Horn and B. G. Schunck. Determining optical flow. Artificial Intelligence, 17:185–203, 1981. [13] H. Kalva. The H.264 video coding standard. IEEE MultiMedia, 13(4):86–90, 2006. [14] T. Kanade and B. Lucas. An iterative image registration technique with an application to stereo vision. In IJCAI, pages 674–679, 1981. [15] S. Khan and M. Shah. Object based segmentation of video using color, motion and spatial information. In CVPR, pages II:746–751, 2001. [16] P. Kohli and P. H. S. Torr. Dynamic graph cuts for efficient inference in Markov random fields. IEEE Trans. Pattern Anal. Mach. Intell., 29(12):2079–2088, 2007. [17] Y. Li, J. Sun, and H.-Y. Shum. Video object cut and paste. ACM Trans. Graph., 24(3):595–600, 2005. [18] Y. Li, J. Sun, C.-K. Tang, and H.-Y. Shum. Lazy snapping. ACM Trans. Graph., 23(3):303–308, 2004. [19] K. Lin, K. Chen, W. Hsu, C. Lee, and T. Li. Boosting object retrieval by estimating pseudo-objects. In ICIP, pages 785–788, 2009. [20] Z. Liu, Y. Lu, and Z. Zhang. Real-time spatiotemporal segmentation of video objects in the H.264 compressed domain. J. Vis. Comun. Image Represent., 18(3):275–290, 2007. [21] H. Miyamori and K. Tanaka. Webified video: media conversion from TV program to web content and their integrated viewing method. In WWW, pages 946–947, 2005. [22] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Object retrieval with large vocabularies and fast spatial matching. In CVPR, pages 1–8, 2007. [23] I. E. Richardson. The H.264/AVC video coding standard for the next generation multimedia communication, 2003. [24] C. Rother, V. Kolmogorov, and A. Blake. “GrabCut”: interactive foreground extraction using iterated graph cuts. ACM Trans. Graph., 23(3):309–314, 2004. [25] J. Sivic and A. Zisserman. Efficient visual search of videos cast as text retrieval. IEEE Trans. Pattern Anal. Mach. Intell., 31(4):591–606, 2009. [26] G. J. Sullivan and T. Wiegand. Video compression - from concepts to the H.264/AVC standard. In Proceedings of the IEEE, pages 18–31, 2005. [27] J. Sun, W. Zhang, X. Tang, and H.-Y. Shum. Background cut. In ECCV, pages 628–641, 2006. [28] K. Szczerba, S. Forchhammer, J. Stottrup-Andersen, and P. T. Eybye. Fast compressed domain motion detection in H.264 video streams for video surveillance applications. In AVSS, pages 478–483, 2009. [29] R. Szeliski, R. Zabih, D. Scharstein, O. Veksler, V. Kolmogorov, A. Agarwala, M. Tappen, and C. Rother. A comparative study of energy minimization methods for Markov random fields with smoothness-based priors. IEEE Trans. Pattern Anal. Mach. Intell., 30(6):1068–1080, 2008. [30] L. Vincent and P. Soille. Watersheds in digital spaces: An efficient algorithm based on immersion simulations. IEEE Trans. Pattern Anal. Mach. Intell., 13(6):583–598, 1991. [31] J. Wang, P. Bhat, R. A. Colburn, M. Agrawala, and M. F. Cohen. Interactive video cutout. ACM Trans. Graph., 24(3):585–594, 2005. [32] J. Wang and M. F. Cohen. Image and video matting: a survey. Found. Trends. Comput. Graph. Vis., 3(2):97–175, 2007. [33] Y. H. Yang, P. T. Wu, C. W. Lee, K. H. Lin, W. H. Hsu, and H. H. Chen. ContextSeer: context search and recommendation at query time for shared consumer photos. In ACM MM, pages 199–208, 2008. [34] T. Yeh, J. J. Lee, and T. Darrell. Photo-based question answering. In ACM MM, pages 389–398, 2008. [35] A. Yilmaz, O. Javed, and M. Shah. Object tracking: A survey. ACM Comput. Surv., 38(4):13, 2006. [36] T. Yokoyama, S. Ota, and T. Watanabe. Noisy MPEG motion vector reduction for motion analysis. In AVSS, pages 274–279, 2009. [37] W. Zeng, J. Du, W. Gao, and Q. Huang. Robust moving object segmentation on H.264/AVC compressed video using the block-based MRF model. RealTimeImg, 11(4):290–299, August 2005. [38] J. Zobel and A. Moffat. Inverted files for text search engines. ACM Computing Surveys, 38(2):6, 2006.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/22306	-
dc.description.abstract	隨著數位影片數量的成長，如何有效地獲取相關資訊成為今日相當重要的問題。消費者在觀看影片時，時常會對出現的物件有興趣，想進一步知道該物件的相關訊息。近年來觸控式面板的蓬勃發展，視覺化的操作方式大幅提升各類產品的互動性。我們提出一個互動式影片物件切割與搜尋系統，希望提供使用者感興趣物件的相關資訊。一般來說，影片的相關文字資訊，如標題、說明、字幕等等，不常與影片中出現的物件直接相關。我們以物件的視覺特徵搜尋外部影像資料庫，而影像伴隨的標籤足以提供這些影片物件更進一步的資訊。由於單一視角的物件常會受限於不同角度，而沒有充足的特徵去搜尋。因此，我們利用影片物件切割，改善搜尋的準確性。同時，為了滿足搜尋的及時性，我們使用影片壓縮的運動向量，改善物件切割的效率。實驗結果顯示，運動向量能大幅提升切割速率且維持搜尋的準確性。	zh_TW
dc.description.abstract	Touch-based displays have enabled rich interaction between users and videos. Users are often interested in the objects that appear in a video and want to learn more about them. We propose a video playback system that lets users interactively query objects of interest in videos. Because the text accompanying a video may not be closely related to the object of interest, we use visual appearance features to retrieve similar objects from large image collections. The tags associated with the retrieved images then reveal related information about the object of interest. A query that relies on a single viewpoint of the object is not robust, because it is sensitive to pose variation and occlusion. We therefore present a novel video object segmentation framework, based on graph cut segmentation, to improve retrieval precision. To ensure a prompt response, we augment the graph cut algorithm with compressed-domain motion vectors; compared with the prior method, the processing speed of our segmentation framework is significantly improved. Experiments on community-contributed videos demonstrate the effectiveness of our segmentation framework, which is based on multi-frame object region queries, and the resulting improvement in retrieval precision.	en
dc.description.provenance	Made available in DSpace on 2021-06-08T04:15:15Z (GMT). No. of bitstreams: 1 ntu-99-R97922001-1.pdf: 9976369 bytes, checksum: 4e10aaffdb0a21f7889a8884dc143bf3 (MD5) Previous issue date: 2010	en
dc.description.tableofcontents	Abstract i; List of Figures iv; List of Tables v; Chapter 1 Introduction 1; 1.1 Inquiry for Object of Interest in Videos 1; 1.2 Image and Video Object Segmentation 3; 1.3 Research Contributions 4; Chapter 2 Related Work 5; 2.1 Graph Cut Segmentation 5; 2.2 Interactive Segmentation 7; 2.3 Motion Vectors from Compressed Domain 8; Chapter 3 Motion-augmented Graph Cut Segmentation 9; 3.1 GrabCut Segmentation 10; 3.2 3D Graph Cut Segmentation 11; 3.3 Motion Information 12; 3.4 Proposed Frameworks 13; 3.4.1 Spatial-temporal Segmentation 13; 3.4.2 Frame-by-frame Segmentation 14; 3.5 Implementation 15; Chapter 4 Experiments 20; 4.1 Datasets 20; 4.2 Evaluation 21; 4.2.1 Segmentation 21; 4.2.2 Retrieval 22; 4.3 Result and Discussion 24; 4.3.1 Segmentation Accuracy and Efficiency 24; 4.3.2 Retrieval Precision 25; 4.3.3 Segmentation and Retrieval 27; Chapter 5 Conclusion and Future Work 32; Bibliography 33
dc.language.iso	en
dc.subject	圖片搜尋	zh_TW
dc.subject	影片物件切割	zh_TW
dc.subject	壓縮域	zh_TW
dc.subject	運動向量	zh_TW
dc.subject	圖形分割	zh_TW
dc.subject	graph cut	en
dc.subject	image retrieval	en
dc.subject	Video object segmentation	en
dc.subject	compressed domain	en
dc.subject	motion vector	en
dc.title	互動式影片物件切割與影像搜尋	zh_TW
dc.title	Interactive Inquiry for Object of Interest in Video Playback by Motion-Augmented Graph Cut	en
dc.type	Thesis
dc.date.schoolyear	98-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	簡韶逸,陳信希
dc.subject.keyword	影片物件切割, 壓縮域, 運動向量, 圖形分割, 圖片搜尋	zh_TW
dc.subject.keyword	Video object segmentation, compressed domain, motion vector, graph cut, image retrieval	en
dc.relation.page	36
dc.rights.note	Not authorized
dc.date.accepted	2010-08-09
dc.contributor.author-college	電機資訊學院	zh_TW
dc.contributor.author-dept	資訊工程學研究所	zh_TW
Appears in Collections:	資訊工程學系 (Department of Computer Science and Information Engineering)

Files in this item:

File	Size	Format
ntu-99-1.pdf (not authorized for public access)	9.74 MB	Adobe PDF

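Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.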