Boosting Object Retrieval by Estimating Pseudo-Objects (以擬似物件提昇物件搜尋效能)

Kuan-Hung Lin; 林冠宏

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/23004

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	徐宏民(Winston H. Hsu)
dc.contributor.author	Kuan-Hung Lin	en
dc.contributor.author	林冠宏	zh_TW
dc.date.accessioned	2021-06-08T04:37:22Z	-
dc.date.copyright	2009-08-20
dc.date.issued	2009
dc.date.submitted	2009-08-17
dc.identifier.citation	[1] M. Flickner, H. Sawhney, W. Niblack, J. Ashley, Q. Huang, B. Dom, M. Gorkani, J. Hafner, D. Lee, D. Petkovic, D. Steele, and P. Yanker, “Query by image and video content: the qbic system,” Computer, vol. 28, no. 9, pp. 23-32, 1995. [2] J. R. Smith and S.-F. Chang, “Visualseek: a fully automated content-based image query system,” in Proc. of ACM Multimedia, 1996. [3] J. Sivic and A. Zisserman, “Video Google: a text retrieval approach to object matching in videos,” in Proc. of ICCV, 2003. [4] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, “Object retrieval with large vocabularies and fast spatial matching,” in Proc. of CVPR, 2007. [5] O. Chum, J. Philbin, J. Sivic, M. Isard, and A. Zisserman, “Total recall: automatic query expansion with a generative feature model for object retrieval,” in Proc. of ICCV, 2007. [6] J. Yang, Y. G. Jiang, A. G. Hauptmann, and C. W. Ngo, “Evaluating bag-of-visual-words representations in scene classification,” in Proc. of MIR, 2007. [7] K. Mikolajczyk and C. Schmid, “Scale & affine invariant interest point detectors,” International Journal of Computer Vision, vol. 60, no. 1, October 2004. [8] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. Van Gool, “A comparison of affine region detectors,” International Journal of Computer Vision, vol. 65, no. 1-2, pp. 43-72, 2005. [9] D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, vol. 60, no. 2, pp. 91-110, November 2004. [10] S. Lazebnik, C. Schmid, and J. Ponce, “Beyond bags of features: spatial pyramid matching for recognizing natural scene categories,” in Proc. of CVPR, 2006. [11] L. A. Barroso, J. Dean, and U. Holzle, “Web search for a planet: the Google cluster architecture,” IEEE Micro, vol. 23, no. 2, pp. 22-28, March-April 2003. [12] O. Chum, J. Matas, and S. Obdrzalek, “Enhancing RANSAC by generalized model optimization,” in Proc. of ACCV, 2004. [13] Y.-H. Yang, P.-T. Wu, C.-W. Lee, K.-H. Lin, W. H. Hsu, and H. H. Chen, “ContextSeer: context search and recommendation at query time for shared consumer photos,” in Proc. of ACM Multimedia, 2008. [14] G. Hamerly and C. Elkan, “Learning the k in k-means,” in Proc. of NIPS, 2003. [15] L. Rabiner and B.-H. Juang, Fundamentals of Speech Recognition, Prentice Hall PTR, 1993. [16] G. Schwarz, “Estimating the dimension of a model,” The Annals of Statistics, vol. 6, no. 2, pp. 461-464, March 1978. [17] M. A. Stephens, “EDF statistics for goodness of fit and some comparisons,” Journal of the American Statistical Association, vol. 69, no. 347, pp. 730-737, September 1974. [18] A. Dempster, N. Laird, and D. Rubin, “Maximum likelihood from incomplete data via the EM algorithm,” Journal of the Royal Statistical Society, Series B, vol. 39, pp. 1-38, 1977.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/23004	-
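dc.description.abstract	Current object retrieval systems are mostly based on visual words: they build, for each image, a feature vector that encodes local features, and perform search according to the distance or similarity between feature vectors in the vector space. Although such a feature vector takes local features into account, it represents the whole image with a single vector, so when an image contains multiple objects and background, the information from each object is blended together. The resulting image vector is mixed and diluted, represents none of the objects in the image, and leads to poor object retrieval quality. We propose the concept of pseudo-objects to approximate the actual objects in an image and thereby resolve this problem. A pseudo-object is a subset of the image's feature points that are spatially proximate; each subset has its own feature vector representing the local area it covers. Each image therefore has multiple feature-point subsets, that is, multiple pseudo-objects, representing the objects the image may actually contain. We propose three effective methods for estimating pseudo-objects: the grid method (Grid), the G-means method, and the Gaussian mixture model with Bayesian information criterion method (GMM-BIC). Object retrieval experiments on two public image datasets show that all three proposed methods far outperform current object retrieval systems.	zh_TW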
dc.description.abstract	State-of-the-art object retrieval systems are mostly based on the bag-of-visual-words representation, which encodes the local appearance information of an image in a feature vector. A search is performed by comparing the query object’s feature vector with those of the database images. However, a database image vector generally carries mixed information from the entire image, which may contain multiple objects and background. Search quality is degraded by such noisy (or diluted) feature vectors. We address this issue by introducing the concept of pseudo-objects to approximate candidate objects in database images. A pseudo-object is a subset of proximate feature points in an image, with its own feature vector representing a local area. We investigate effective methods (e.g., grid, G-means, and GMM-BIC) to estimate pseudo-objects. In experiments on two consumer photo benchmarks, we demonstrate that the proposed method significantly outperforms other state-of-the-art object retrieval algorithms.	en
dc.description.provenance	Made available in DSpace on 2021-06-08T04:37:22Z (GMT). No. of bitstreams: 1 ntu-98-R96922058-1.pdf: 1708236 bytes, checksum: 7a777c476d285c2e1a4c06828d5f2d91 (MD5) Previous issue date: 2009	en
dc.description.tableofcontents	Abstract (in Chinese) i; Abstract ii; Chapter 1 Introduction 1; 1.1 Vector Space Model for Image Retrieval 2; 1.2 Object Retrieval 4; 1.3 Bag-of-Words Representation 6; 1.4 Spatial Pyramid Matching 9; 1.5 Limitations of Prior Works 10; Chapter 2 Pseudo-Objects 15; 2.1 The Grid Method 16; 2.2 The G-means Method 17; 2.3 The Gaussian Mixture Model with Bayesian Information Criterion Method 22; 2.4 Image Scoring Based on Pseudo-Objects 25; Chapter 3 Evaluation 27; 3.1 Benchmarks 27; 3.2 Implementation 30; 3.3 Results and Discussion 31; Chapter 4 Conclusions and Future Work 36; Bibliography 37
dc.language.iso	en
dc.title	以擬似物件提昇物件搜尋效能	zh_TW
dc.title	Boosting Object Retrieval by Estimating Pseudo-Objects	en
dc.type	Thesis
dc.date.schoolyear	97-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	李明穗(Ming-Sui Lee),林軒田(Hsuan-Tien Lin)
dc.subject.keyword	image retrieval, object retrieval, pseudo-object, visual word, local feature	zh_TW
dc.subject.keyword	image retrieval, object retrieval, pseudo-object, visual word, local feature	en
dc.relation.page	37
dc.rights.note	Not authorized
dc.date.accepted	2009-08-17
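dc.contributor.author-college	College of Electrical Engineering and Computer Science	zh_TW
dc.contributor.author-dept	Graduate Institute of Computer Science and Information Engineering	zh_TW
Appears in Collections:	Department of Computer Science and Information Engineering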
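
The grid variant of the pseudo-object idea described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function names (`grid_pseudo_objects`, `cosine`, `score_image`), the 2x2 grid, the use of cosine similarity, and the rule that an image is scored by its best-matching pseudo-object are all assumptions made for illustration only.

```python
from collections import Counter
from math import sqrt


def grid_pseudo_objects(points, width, height, rows, cols):
    """Group feature points into grid cells; each non-empty cell is one pseudo-object.

    points: iterable of (x, y, visual_word_id) tuples (hypothetical input format).
    Returns a dict mapping (row, col) to a Counter of visual-word counts.
    """
    cells = {}
    for x, y, word in points:
        # Clamp so that points on the right or bottom border fall in the last cell.
        col = min(int(x * cols / width), cols - 1)
        row = min(int(y * rows / height), rows - 1)
        cells.setdefault((row, col), Counter())[word] += 1
    return cells


def cosine(a, b):
    """Cosine similarity between two sparse visual-word histograms."""
    dot = sum(count * b.get(word, 0) for word, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_image(query, points, width, height, rows=2, cols=2):
    """Assumed scoring rule: the image scores as well as its best pseudo-object."""
    pseudo_objects = grid_pseudo_objects(points, width, height, rows, cols)
    return max((cosine(query, hist) for hist in pseudo_objects.values()), default=0.0)


# Toy image: an object (words 1, 2) in the top-left, background (words 7, 8, 9) elsewhere.
points = [(10, 10, 1), (20, 15, 2), (80, 80, 7), (90, 70, 8), (70, 90, 9)]
query = Counter({1: 1, 2: 1})

whole_image = cosine(query, Counter(word for _, _, word in points))
per_object = score_image(query, points, width=100, height=100)
print(whole_image, per_object)
```

In this toy case the single whole-image histogram is diluted by the background words, while the top-left grid cell matches the query almost exactly, which is the effect the abstract attributes to pseudo-objects.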

Files in This Item:

File	Size	Format
ntu-98-1.pdf (currently not authorized for public access)	1.67 MB	Adobe PDF


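Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.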