Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/45262
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 徐宏民(Winston H. Hsu) | |
dc.contributor.author | Wei-Shing Liao | en |
dc.contributor.author | 廖瑋星 | zh_TW |
dc.date.accessioned | 2021-06-15T04:11:20Z | - |
dc.date.available | 2011-02-04 | |
dc.date.copyright | 2010-02-04 | |
dc.date.issued | 2010 | |
dc.date.submitted | 2010-01-27 | |
dc.identifier.citation | [1] Florent Monay, Daniel Gatica-Perez. PLSA-based Image Auto-Annotation: Constraining the Latent Space. ACM Multimedia 2004.
[2] Rainer Lienhart, Malcolm Slaney. PLSA on Large Scale Databases. ICASSP 2007.
[3] Eva Horster, Rainer Lienhart, Malcolm Slaney. Image Retrieval on Large-Scale Image Databases. CIVR 2007.
[4] Xiaofei He, Deng Cai, Ji-Rong Wen, Wei-Ying Ma, Hong-Jiang Zhang. ImageSeer: Clustering and Searching WWW Images Using Link and Page Layout Analysis. Microsoft Technical Report, 2004.
[5] Changhu Wang, Feng Jing, Lei Zhang, Hong-Jiang Zhang. Image Annotation Refinement Using Random Walk with Restarts. ACM Multimedia 2006.
[6] Xin-Jing Wang, Wei-Ying Ma, Gui-Rong Xue. Multi-Model Similarity Propagation and Its Application for Web Image Retrieval. ACM Multimedia 2004.
[7] Mehran Sahami, Tim Heilman. A Web-based Kernel Function for Measuring the Similarity of Short Text Snippets. WWW '06.
[8] Apostol Natsev, Alexander Haubold, Jelena Tesic, Lexing Xie, Rong Yan. Semantic Concept-based Query Expansion and Re-ranking for Multimedia Retrieval. ACM Multimedia 2007.
[9] M. Ames and M. Naaman. Why We Tag: Motivations for Annotation in Mobile and Online Media. In CHI '07, pages 971–980. ACM, 2007.
[10] Akira Yanagawa, Shih-Fu Chang, Lyndon Kennedy, Winston Hsu. Columbia University's Baseline Detectors for 374 LSCOM Semantic Visual Concepts. Columbia University Research Report, March 2007.
[11] Y. Li, K. Wan, X. Yan, and C. Xu. Advertisement Insertion in Baseball Video Based on Advertisement Effect. In Proceedings of ACM Multimedia, pages 343–346, Singapore, November 2005.
[12] Wei-Shing Liao, Kuan-Ting Chen, Winston Hsu. AdImage: Video Advertising by Image Matching and Ad Scheduling Optimization. ACM SIGIR 2008.
[13] X.-J. Wang, L. Zhang, F. Jing, and W.-Y. Ma. AnnoSearch: Image Auto-Annotation by Search. In Proceedings of CVPR, 2006.
[14] T. Yeh, J. J. Lee, T. Darrell. Photo-based Question Answering. ACM Multimedia 2008.
[15] Noah Snavely, Steven M. Seitz, Richard Szeliski. Photo Tourism: Exploring Photo Collections in 3D. ACM Transactions on Graphics (TOG), vol. 25, no. 3, July 2006.
[16] B. S. Manjunath and W. Y. Ma. Texture Features for Browsing and Retrieval of Image Data. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 18, no. 8, pages 837–842, August 1996.
[17] G. Salton and C. Buckley. Term-weighting Approaches in Automatic Text Retrieval. Information Processing & Management, 24(5):513–523, 1988.
[18] M. La Cascia, S. Sethi, S. Sclaroff. Combining Textual and Visual Cues for Content-Based Image Retrieval on the World Wide Web. Proc. IEEE Workshop on Content-Based Access of Image and Video Libraries, 1998.
[19] Lyndon S. Kennedy, Shih-Fu Chang, Igor V. Kozintsev. To Search or to Label?: Predicting the Performance of Search-based Automatic Image Classifiers. Proceedings of the 8th ACM International Workshop on Multimedia Information Retrieval, 2006.
[20] Lyndon Kennedy, Mor Naaman, Shane Ahern, Rahul Nair, Tye Rattenbury. How Flickr Helps Us Make Sense of the World: Context and Content in Community-Contributed Media Collections. ACM Multimedia 2007.
[21] Thomas Hofmann. Probabilistic Latent Semantic Indexing. Proceedings of the 22nd Annual International ACM SIGIR Conference, 1999.
[22] D. Lowe. Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision, vol. 60, no. 2, pages 91–110, 2004.
[23] M. Flickner, H. Sawhney, W. Niblack, J. Ashley, Q. Huang, B. Dom, M. Gorkani, J. Hafner, D. Lee, D. Petkovic, D. Steele, and P. Yanker. Query by Image and Video Content: The QBIC System. Computer, 28(9):23–32, September 1995.
[24] A. Pentland, R. Picard, and S. Sclaroff. Photobook: Content-based Manipulation of Image Databases. IJCV, vol. 18, no. 3, pages 233–254, 1996.
[25] W. Y. Ma and B. S. Manjunath. NeTra: A Toolbox for Navigating Large Image Databases. Multimedia Systems, 1999.
[26] K. Barnard, N. V. Shirahatti. A Method for Comparing Content Based Image Retrieval Methods. Internet Imaging IX, Electronic Imaging, 2003.
[27] A. N. Langville and C. D. Meyer. A Survey of Eigenvector Methods of Web Information Retrieval. SIAM Review, 2003.
[28] R. Bayardo, Y. Ma, and R. Srikant. Scaling Up All-Pairs Similarity Search. In WWW, 2007.
[29] Y.-H. Yang et al. ContextSeer: Context Search and Recommendation at Query Time for Shared Consumer Photos. In ACM Multimedia, 2008.
[30] D. Blei, A. Ng, and M. Jordan. Latent Dirichlet Allocation. Journal of Machine Learning Research, 2003.
[31] R. Cilibrasi and P. M. B. Vitanyi. The Google Similarity Distance. IEEE Transactions on Knowledge and Data Engineering, 19:370, 2007.
[32] Lei Wu, Xian-Sheng Hua, et al. Flickr Distance. ACM Multimedia 2008.
[33] J. Sivic and A. Zisserman. Video Google: A Text Retrieval Approach to Object Matching in Videos. In ICCV, 2003.
[34] Andrew Goodman. Winning Results with Google AdWords, 2008.
[35] Jinqiao Wang et al. Online Video Advertising Based on User's Attention Relevancy Computing, 2008.
[36] L. Kennedy et al. How Flickr Helps Us Make Sense of the World: Context and Content in Community-Contributed Media Collections. In ACM Multimedia, Augsburg, Germany, September 2007.
[37] A. Mehta et al. AdWords and Generalized On-line Matching. In Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science, pages 264–273, Pittsburgh, USA, October 2005.
[38] Tao Mei et al. VideoSense: Towards Effective Online Video Advertising. In Proceedings of ACM Multimedia, pages 1075–1084, Augsburg, Germany, September 2007.
[39] D. Lowe. Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision, vol. 60, no. 2, pages 91–110, 2004.
[40] M. Fischler and R. Bolles. Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography. Communications of the ACM, vol. 24, no. 6, pages 381–395, 1981.
[41] S. Arya and D. M. Mount. Approximate Nearest Neighbor Searching. Proc. 4th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '93), pages 271–280, 1993.
[42] K. Mikolajczyk, C. Schmid. A Performance Evaluation of Local Descriptors. In CVPR, volume 2, pages 257–263, 2003.
[43] J. E. Gilbert and Y. Lin. An Effectiveness Study of Online Advertising Models. Entrepreneurship in a Complex Society, Edwin Mellen Press, to appear.
[44] Digital Video Ad Format Guidelines and Best Practices. Interactive Advertising Bureau report, April 2008.
[45] NIST TREC Video Retrieval Evaluation. http://www-nlpir.nist.gov/projects/trecvid/
[46] http://www.touchgraph.com
[47] http://www.semanticmetadata.net/2008/07/03/less-than-20-of-flickr-images-tagged/ | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/45262 | - |
dc.description.abstract | 基於影像內容的擷取(CBIR),是一種有效管理數量以指數成長的圖片之關鍵技術,並可應用於諸如搜尋式的相片標註(annotation by search),計算相片學(computational photography)和圖片的問答(photo-based question answering)。雖然經過數十年的研究,現有的方法仍然受限於語意隔閡(semantic gap)。於此論文我們提出一種發掘蓬勃發展的多媒體分享服務(如Flickr)和搜尋引擎(如Google)去取得外界知識(諸如:相片、相片標籤、部落格、輔助資料)來建立圖片之間的語意關係來改善CBIR成效的技術。此外,我們也提出一個系統來呈現如何善用語意擴充(semantic expansion)。此系統結合影像比對技術(包含諸如:商標、地標的影像),自動於影片中安插廣告。此廣告推薦系統也是一種前瞻的服務將有機會增加網路商業效益。
因為傳統CBIR的技術都受制於語意隔閡(semantic gap)的問題,於是我們提出一種結合圖片視覺與文字的技術。然而這些由社群網路提供的資料夾帶著大量的雜訊,包括相片標籤等等。於是我們利用Google的知識來計算出這些相片標籤之間的語意相關性。我們利用一種基於圖論為基礎的方法,為每一種線索都建立起一個模組。經由線性組合起來,三種線索建立起的三個模組將會合併成為一個模組。於此,我們利用隨機漫步(random walk)來解此問題,以任一CBIR方法的結果視作初始值,接著以隨機漫步的方式來改善成果。因為每個模組的建構是獨立的,我們將可容易的於線性組合當中改變它們的權重以達不同需求。除此之外,我們也考慮了處理大量資料時效率的問題。同時基於這個架構下,也可延伸應用於基於文字的圖片搜尋(keyword-based image retrieval)與圖片自動註記(image annotation)。 為呈現語意擴充的用途,我們也提出了一個廣告推薦系統: AdVis。這是一個基於影像比對技術的自動安插廣告於影片當中的系統。於此,標價者將會對有興趣的影像(AdImage)進行標價,類似於AdWords的系統。此外AdVis著重於系統效益的最佳化並且考慮觀看者的感受。我們將此問題公式化為0-1 integer programming problem。 實驗於Flickr資料庫上顯示語意擴充有效解決於: (1)基於圖片內容的圖片搜尋,搜尋效果比起過去方法成長了200%; (2)基於文字的圖片搜尋,搜尋結果相較於傳統的文字搜尋也表現突出,肇因於傳統文字搜尋受限於雜訊與不完整的相片標籤; (3)相片自動標籤,和過去具代表性的方法比較也有大幅度的成長。 | zh_TW |
dc.description.abstract | Content-based image retrieval (CBIR) is one of the essential techniques for managing exponentially growing photo collections and an enabling technology for many applications such as annotation by search, computational photography, and photo-based question answering. Despite decades of research, current solutions are still limited by the "semantic gap." In this work, we propose to improve CBIR by automatically exploiting auxiliary knowledge (i.e., tags, photos, blogs, metadata, etc.) from booming media-sharing services (e.g., Flickr) and search engines (e.g., Google); that is, by finding more semantically related images to enhance CBIR results. To demonstrate the benefits of semantic expansion, we further apply the proposed framework in a promising application: video advertising by target image matching, which automatically associates relevant ads by content-based matching over related image objects (e.g., logos, scenes, etc.). Given the prevalence of shared videos, this is one of the potential applications for Internet monetization.
Since traditional CBIR methods are hindered by the semantic gap, our approach leverages the content and context information widely available in media-sharing services. However, such community-contributed cues (e.g., tags, descriptions, image appearance, metadata) are generally noisy. We measure the semantic similarity between tags by exploiting Google knowledge, and we apply a graph-based approach in which one graph model is built for each cue. After aggregating the multiple cues (as graphs) in a linear manner, we perform a random walk on the unified graph, starting from the initial CBIR search results, to further improve precision and recall. Since each graph is built independently, the combination weights can be adapted to different applications. We also consider the efficiency issues of deployment on large-scale media-sharing sites. Moreover, the framework is generic and can be extended to other applications such as keyword-based image retrieval and image annotation. To demonstrate the benefits of semantic expansion, we also propose a framework called AdVis, which automatically associates relevant ads by visual matching. Here, a bidder bids on ads through images of interest (AdImages), analogous to keywords in the AdWords model. AdVis aims to maximize both system revenue and user perception. We formulate the solution as a nonlinear 0-1 integer programming problem. Experimenting on Flickr photo benchmarks, the proposed semantic expansion framework performs well in several aspects: (1) example-based image retrieval: outperforming traditional CBIR systems by up to 200%; (2) text-based image retrieval: significant performance gains over conventional keyword-based search, which suffers from noisy and missing tags; (3) image auto-annotation: significant gains over other search-based image annotation approaches. | en |
dc.description.provenance | Made available in DSpace on 2021-06-15T04:11:20Z (GMT). No. of bitstreams: 1 ntu-99-R96922012-1.pdf: 1272844 bytes, checksum: 93a89f3c17399bb5d5837e86437164a3 (MD5) Previous issue date: 2010 | en |
dc.description.tableofcontents | Acknowledgements i
Chinese Abstract ii
Abstract iv
List of Contents vii
List of Figures xi
List of Tables xiii
Chapter 1. Introduction 1
1.1 Related Works 5
Chapter 2. Semantic Expansion 8
2.1 Overview of Semantic Expansion 8
2.2 Image Feature Extraction 10
2.3 Feature Co-occurrence Capturing 11
2.3.1 Aspect Model 11
2.3.2 Image Graph Construction based on Textual and Visual Co-occurrence Observation (G(1) and G(2)) 12
2.4 Bag-of-words Semantic Similarity Discovering 13
2.4.1 Semantic Meaning of Bag-of-words 13
2.4.2 Efficiency for Bag-of-words 16
2.4.3 Constructing Image Graph based on Tag Semantic Similarity (G(3)) 17
2.5 Image Graph Combination 17
Chapter 3. Example-based Image Retrieval through Semantic Expansion 19
3.1 Semantic Image Retrieval on Bag-of-images 20
3.1.1 Random Walk 21
3.1.2 Random Walk with Restarts 22
3.2 Semantic Image Retrieval on New Resource 23
3.3 Other Applications 24
3.3.1 Keyword-based Image Retrieval 24
3.3.2 Image Auto-annotation 26
Chapter 4. Experiments and Results 28
4.1 Dataset 28
4.2 Example-based Image Retrieval 29
4.2.1 Experiment on In-Social Media Resource 30
4.2.2 Experiment on Out-Social Media Resource 32
4.2.3 Parameter Sensibility 33
4.3 Keyword-based Image Retrieval 37
4.4 Image Auto-annotation 39
Chapter 5. AdVis: Video Advertising by Content-based Matching and Ad Scheduling Algorithm 41
5.1 Related Works 43
5.2 System Framework 44
5.3 Content-based Visual Matching 45
5.4 Ad Scheduling Algorithm 46
5.4.1 AdWords Algorithm 46
5.4.2 Ad Scheduling for Shared Videos 47
5.4.3 Ad Scheduling Optimization 48
5.5 Experiments and Results 50
5.5.1 Evaluation of Ad Scheduling 50
Chapter 6. Conclusions 54
Bibliography 55 | |
dc.language.iso | en | |
dc.title | 基於圖片內容擷取之圖像語意擴充與其影片廣告安插之應用 | zh_TW |
dc.title | Semantic Query Expansion for Content-based Image Retrieval and Its Application on Video Advertising | en |
dc.type | Thesis | |
dc.date.schoolyear | 98-1 | |
dc.description.degree | 碩士 | |
dc.contributor.oralexamcommittee | 許永真(Jane Yung-Jen Hsu),廖弘源(Hong-Yuan Liao) | |
dc.subject.keyword | 圖片搜尋,語意擴充,相片文字標籤,影片廣告, | zh_TW |
dc.subject.keyword | Content-based Image Retrieval,Semantic Query Expansion,Visual Words,Tag,Video Advertising, | en |
dc.relation.page | 59 | |
dc.rights.note | 有償授權 | |
dc.date.accepted | 2010-01-27 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 資訊工程學研究所 | zh_TW |
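The abstract above describes refining initial CBIR results by a random walk with restarts (RWR) over a linear combination of per-cue image graphs. A minimal sketch of that step in plain Python on a toy three-image example; the adjacency matrices, combination weights, and restart probability below are illustrative assumptions, not values from the thesis:

```python
# Random walk with restarts over a linearly combined image graph (toy data).

def combine_graphs(graphs, weights):
    """Linearly combine per-cue adjacency matrices (lists of lists)."""
    n = len(graphs[0])
    return [[sum(w * g[i][j] for g, w in zip(graphs, weights))
             for j in range(n)] for i in range(n)]

def row_normalize(m):
    """Turn an adjacency matrix into a row-stochastic transition matrix."""
    out = []
    for row in m:
        s = sum(row)
        out.append([x / s if s else 0.0 for x in row])
    return out

def rwr(transition, restart, alpha=0.15, iters=100):
    """Iterate p <- (1 - alpha) * P^T p + alpha * r toward a fixed point."""
    n = len(transition)
    p = restart[:]
    for _ in range(iters):
        p = [(1 - alpha) * sum(transition[j][i] * p[j] for j in range(n))
             + alpha * restart[i] for i in range(n)]
    return p

# Three hypothetical cue graphs over 3 images: textual co-occurrence,
# visual co-occurrence, and tag semantic similarity.
g_text = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
g_vis  = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
g_tag  = [[0, 0, 1], [0, 0, 1], [1, 1, 0]]

combined = row_normalize(combine_graphs([g_text, g_vis, g_tag],
                                        [0.5, 0.3, 0.2]))
# Restart vector from the initial CBIR result: image 0 is the query seed.
scores = rwr(combined, restart=[1.0, 0.0, 0.0])
ranking = sorted(range(3), key=lambda i: -scores[i])
```

With the seed image as the restart node, relevance mass propagates to images linked through any of the cue graphs, which is how the framework can surface semantically related images that pure visual matching would miss; because each graph is built independently, only the weight vector needs to change per application.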
Appears in Collections: | Department of Computer Science and Information Engineering |
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-99-1.pdf (currently not authorized for public access) | 1.24 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.