Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/51715
Full metadata record
DC field: value (language)
dc.contributor.advisor: 徐宏民 (Winston H. Hsu)
dc.contributor.author: Kuan-Ting Chen (en)
dc.contributor.author: 陳冠婷 (zh_TW)
dc.date.accessioned: 2021-06-15T13:45:59Z
dc.date.available: 2021-02-15
dc.date.copyright: 2016-02-15
dc.date.issued: 2015
dc.date.submitted: 2015-11-30
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/51715
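dc.description.abstract: With the development and spread of mobile communication and digital photography, users contribute an ever larger volume of photos. How to let users search such massive photo collections by example, effectively and quickly, and obtain the information they need has therefore become an important issue. In recent years, the performance of photo search based on feature points or visual words has depended on two important factors: (1) efficient spatial verification, and (2) an effective representation of the objects in photos. This study therefore examines these two issues in example-based object retrieval. For the first issue, we investigate how to perform spatial verification efficiently. In traditional methods this step is usually time-consuming. We observe, however, that in large-scale image search the correct feature-point matches usually concentrate in certain regions of the query image, which we call hot spots. We therefore propose a locality-sensitive sample consensus method that tries to find the feature-point regions that are reliable and correct for spatial verification, together with an effective method for updating these hot spots, in order to speed up spatial verification. We also apply this technique to a novel example-based ad matching task to speed up matching and improve its accuracy. The second issue is how to represent the objects in photos effectively. Because an image often contains multiple objects or background, we propose a novel concept of pseudo-objects, that is, subsets of an image's feature points that are spatially proximate, to approximate the real objects in the image. Experiments on real photo collections show that, compared with traditional approaches, the proposed methods substantially improve both the accuracy and the efficiency of example-based object retrieval. (zh_TW, translated)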
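The hot-spot idea in this abstract can be illustrated with a small sketch. This is not the thesis's LOCSAC implementation: the record does not describe its sampling scheme, hot-spot estimation, or online update. The sketch assumes a single isotropic Gaussian hot spot with a known center, a translation-only geometric model, and hypothetical helper names (`hot_spot_weight`, `count_inliers`, `guided_consensus`, `make_toy_matches`). It shows only the general principle of drawing RANSAC-style hypotheses preferentially from the region where correct matches concentrate.

```python
import math
import random


def hot_spot_weight(point, center, sigma):
    """Unnormalized isotropic Gaussian weight of a query-image point (assumed hot-spot model)."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))


def count_inliers(matches, shift, tol):
    """Count matches whose target point agrees with the translation `shift` within `tol` pixels."""
    return sum(
        1
        for (qx, qy), (tx, ty) in matches
        if math.hypot(qx + shift[0] - tx, qy + shift[1] - ty) <= tol
    )


def guided_consensus(matches, center, sigma, iterations=20, tol=2.0, seed=0):
    """RANSAC-style loop that samples hypotheses with probability proportional to hot-spot weight."""
    rng = random.Random(seed)
    # A small constant keeps every match selectable, even far from the hot spot.
    weights = [hot_spot_weight(q, center, sigma) + 1e-9 for q, _ in matches]
    best_shift, best_count = None, -1
    for _ in range(iterations):
        (qx, qy), (tx, ty) = rng.choices(matches, weights=weights, k=1)[0]
        shift = (tx - qx, ty - qy)  # a single match determines a translation
        n = count_inliers(matches, shift, tol)
        if n > best_count:
            best_shift, best_count = shift, n
    return best_shift, best_count


def make_toy_matches(seed=1):
    """Synthetic data: 15 correct matches clustered near (50, 50), 85 random incorrect matches."""
    rng = random.Random(seed)
    true_shift = (10.0, -5.0)
    matches = []
    for _ in range(15):
        q = (50 + rng.uniform(-5, 5), 50 + rng.uniform(-5, 5))
        matches.append((q, (q[0] + true_shift[0], q[1] + true_shift[1])))
    for _ in range(85):
        q = (rng.uniform(0, 200), rng.uniform(0, 200))
        matches.append((q, (rng.uniform(0, 200), rng.uniform(0, 200))))
    return matches


matches = make_toy_matches()
shift, support = guided_consensus(matches, center=(50.0, 50.0), sigma=10.0)
print(shift, support)
```

With uniform sampling, only about 15% of the draws in this toy set would come from a correct match. Weighting by the hot spot raises that share considerably, which is the intuition behind needing fewer iterations. A realistic system would use a richer model (for example an affine transform or homography) and would have to estimate the hot spots from data.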
dc.description.abstract: With the exponential growth of image collections, there is a growing need for efficient and effective example-based image search. Recently, image search by matching bags of feature points or visual words has become an important paradigm, in which search quality depends on the representation of objects in photos and on spatial verification. In this thesis, we investigate these two topics in example-based object retrieval.
The first topic is efficient spatial verification, which exploits a geometric model between matching candidates to reject false positives and to enable further applications. Traditional methods for spatial verification are time-consuming because they estimate model parameters iteratively from a set of noisy observed data. We observe instead that the matches in large-scale image retrieval often correspond to certain regions of the query image: the hot spots. The proposed approach, Locality Sensitive Sample Consensus (LOCSAC), therefore seeks "good matches" for accurate geometric model estimation. In addition, an online framework is devised for adaptively updating hot spot regions.
The second topic is the representation of objects in photos. A database image representation generally carries mixed information about the entire image, which may contain multiple objects and background. To tackle this problem, we propose a novel representation of objects, the pseudo-object: a subset of proximate feature points, with its own feature vector representing a local area, that approximates a candidate object in a database image.
In experiments on consumer photo benchmarks, we show that the proposed spatial verification method achieves a 20-fold speed-up on average over conventional methods while maintaining the same or better quality. We also confirm that the proposed pseudo-objects significantly benefit object retrieval in both accuracy and efficiency. (en)
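The pseudo-object idea in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's method. It groups feature points with a fixed grid, which may differ from the grid, G-means, and GMM methods named in the table of contents. It also scores an image by its best-matching group using cosine similarity over visual-word histograms, which is an assumed rule. The function names `grid_pseudo_objects`, `cosine`, and `score_image` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def grid_pseudo_objects(features, cell_size):
    """Group (x, y, visual_word) features by grid cell; each cell yields one word histogram."""
    cells = defaultdict(Counter)
    for x, y, word in features:
        cells[(int(x // cell_size), int(y // cell_size))][word] += 1
    return dict(cells)


def cosine(a, b):
    """Cosine similarity between two visual-word histograms stored as Counters."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score_image(query_hist, features, cell_size):
    """Score a database image by its best-matching pseudo-object (assumed scoring rule)."""
    objects = grid_pseudo_objects(features, cell_size)
    return max((cosine(query_hist, h) for h in objects.values()), default=0.0)


# Toy database image: the queried object (words 1-3) sits in one corner,
# unrelated background (words 7-9) in another.
features = [
    (10, 10, 1), (20, 15, 2), (30, 25, 3),
    (150, 160, 7), (170, 150, 8), (160, 180, 9),
]
query_hist = Counter({1: 1, 2: 1, 3: 1})

whole_image_hist = Counter(word for _, _, word in features)
whole_score = cosine(query_hist, whole_image_hist)
pseudo_score = score_image(query_hist, features, cell_size=100)
print(whole_score, pseudo_score)
```

In this toy case the whole-image histogram mixes object and background words and so dilutes the similarity, while the pseudo-object covering the object region matches the query closely. This is the mixed-information problem the abstract describes.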
dc.description.provenance: Made available in DSpace on 2021-06-15T13:45:59Z (GMT). No. of bitstreams: 1
ntu-104-D96944008-1.pdf: 7913502 bytes, checksum: b23956ea08b2b1d09891f90739476105 (MD5)
Previous issue date: 2015 (en)
dc.description.tableofcontents:
Oral Examination Committee Certification
Acknowledgments
Abstract (Chinese)
Abstract
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 Related Work
2.1 Image Segmentation for Objects
2.2 Spatial Pyramid Matching
2.3 Bundle Features and Visual Phrases
Chapter 3 Locality Sensitive Spatial Re-ranking for Example-based Object Retrieval
3.1 Introduction
3.2 Feature Representations and Matching
3.3 Locality Sensitive Sample Consensus (LOCSAC)
3.3.1 Hot Spot Estimation by GMM
3.3.2 Online Update for Hot Spot Learning
3.3.3 Incorrect Match Filtering
3.3.4 Hot Spot Selection for Spatial Verification
3.4 Experiments
3.4.1 Datasets and Evaluation Method
3.4.2 The Accuracy of Locality Sensitive Sample Consensus
3.4.3 The Efficiency of Locality Sensitive Sample Consensus
Chapter 4 Boosting Example-based Object Retrieval by Automatically Discovered Pseudo-Objects
4.1 Introduction
4.2 Discovering Pseudo-Objects
4.2.1 The Grid Method
4.2.2 The G-means Method
4.2.3 The Gaussian Mixture Model with Bayesian Information Criterion Method
4.2.4 Image Scoring Based on Pseudo-Objects
4.3 Inverted-File Indexing for Pseudo-Objects
4.4 Experiments
4.4.1 Datasets and Performance Metrics
4.4.2 Experimental Setup
4.4.3 Results and Discussions for Pseudo-object Retrieval
4.4.4 Results and Discussions for Inverted-file Indexing
Chapter 5 Extensive Applications: Contextual Video Advertising by Ad Scheduling Optimization and Example-based Object Retrieval
5.1 Introduction
5.2 Related Work
5.3 AdVis System Framework
5.4 Ad Scheduling Algorithm
5.4.1 AdWords Algorithm
5.4.2 Ad Scheduling for Shared Videos
5.5 Visual Matching for Exploring Rich Contexts in Videos
5.6 Experiments
5.6.1 Evaluation of Visual Matching
5.6.2 Evaluation of Ad Scheduling
5.7 Discussions
Chapter 6 Conclusions and Suggestions for Future Work
6.1 Conclusions
6.2 Suggestions for Future Work
Bibliography
dc.language.iso: en
dc.subject: 擬似物件 [pseudo-object] (zh_TW)
dc.subject: 空間關係驗證 [spatial verification] (zh_TW)
dc.subject: 物件檢索 [object retrieval] (zh_TW)
dc.subject: 影像搜尋 [image search] (zh_TW)
dc.subject: image matching (en)
dc.subject: pseudo-object (en)
dc.subject: object retrieval (en)
dc.subject: spatial verification (en)
dc.title: 在巨量資料中進行以樣本為基礎的物件搜尋 [Example-based Object Search in Massive Data] (zh_TW)
dc.title: Example-based Object Retrieval in Large-Scale User-Contributed Photos (en)
dc.type: Thesis
dc.date.schoolyear: 104-1
dc.description.degree: Doctoral
dc.contributor.oralexamcommittee: 歐陽明 (Ouhyoung Ming), 陳信希 (Hsin-Hsi Chen), 鄭文皇 (Wen-Huang Cheng), 陳祝嵩 (Chu-song Chen), 余能豪 (Neng-Hao Yu)
dc.subject.keyword: 影像搜尋 [image search], 空間關係驗證 [spatial verification], 物件檢索 [object retrieval], 擬似物件 [pseudo-object] (zh_TW)
dc.subject.keyword: image matching, spatial verification, object retrieval, pseudo-object (en)
dc.relation.page: 82
dc.rights.note: Paid authorization
dc.date.accepted: 2015-11-30
dc.contributor.author-college: College of Electrical Engineering and Computer Science (zh_TW)
dc.contributor.author-dept: Graduate Institute of Networking and Multimedia (zh_TW)
Appears in collections: Graduate Institute of Networking and Multimedia

Files in this item:
File: ntu-104-1.pdf (not authorized for public access)
Size: 7.73 MB
Format: Adobe PDF


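Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.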
