Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/23430
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 傅楸善(Chiou-Shann Fuh) | |
dc.contributor.author | Yen-Yu Lin | en |
dc.contributor.author | 林彥宇 | zh_TW |
dc.date.accessioned | 2021-06-08T05:01:31Z | - |
dc.date.copyright | 2010-12-10 | |
dc.date.issued | 2010 | |
dc.date.submitted | 2010-11-22 | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/23430 | - |
dc.description.abstract | This thesis investigates computer vision applications in which data possess multiple feature representations, and exploits the complementarity among the different representations to improve the performance of these applications. To this end, we propose several multiple kernel learning techniques that realize the combination of heterogeneous feature representations and thereby facilitate the vision applications. The thesis consists of three parts. The multiple kernel learning techniques developed in each part are independent and self-contained, yet they all share a common theme: under each feature representation, the pairwise relationships among data are recorded in a kernel matrix, so kernel matrices become the unified feature representation. In this setting, different feature representations are combined in the form of kernel matrices; this also means that these vision applications gain a direct connection to kernel machines, the state-of-the-art machine learning methodology.

In the first part of the thesis, we focus on expressing data with multiple feature representations in a single, compact form. Such data are typically high dimensional, and different representations often take diverse forms, so projecting the data into a low-dimensional Euclidean space facilitates many vision applications, such as classification, recognition, and clustering. The proposed method (termed MKL-DR) therefore incorporates multiple kernel learning into dimensionality reduction: multiple kernel learning combines the heterogeneous features, while dimensionality reduction projects the data into a low-dimensional space. MKL-DR has three notable properties. First, it supports feature representations of various forms, so the characteristics of different aspects of the data can be described more precisely. Second, it enables many existing dimensionality reduction algorithms to learn with multiple kernels simultaneously, and improves their effectiveness. Third, through the objective functions of different dimensionality reduction algorithms, the MKL-DR model broadens the applicability of multiple kernel learning, extending it from supervised learning problems to semi-supervised and unsupervised ones.

In the second part, we introduce the concept of local ensemble kernels for adaptive feature fusion. Vision applications often deal with data of high variability; for recognition, the optimal features frequently vary from sample to sample, so learning a single global feature combination for all data cannot make the most of the multiple features. We instead propose to learn one local classifier for each training sample, optimized over the neighborhood centered at that sample, including the selection of the best feature combination and the classifier parameters. Learning these local classifiers achieves locally adaptive feature fusion, but it requires as many classifiers as training samples, which is computationally infeasible for large datasets. To resolve this, we present a new training procedure that casts the training of the local classifiers as a multi-task learning problem, together with a multi-task boosting algorithm that realizes it; this framework not only significantly accelerates the learning of the local classifiers but also reduces the risk of overfitting.

In the third part, we apply supervised multiple kernel learning techniques to semi-supervised and unsupervised clustering tasks. Unlike the first two parts, the difficulty here is rooted in the unsupervised setting of clustering: no labeled data are available for inferring the combination coefficients of the optimal ensemble kernel. To overcome this, we connect supervised multiple kernel learning with unsupervised clustering through an optimization problem, and further develop a new clustering algorithm that effectively detects the coherent structure of each cluster in complex datasets and realizes cluster-wise feature selection. Concretely, each cluster is associated with a boosting classifier that realizes multiple kernel learning and selects the feature combination best separating data belonging to the cluster from the rest. The training of these classifiers is embedded in the clustering task, and the overall optimization problem is solved iteratively. During optimization, the structure of each cluster is gradually revealed through its classifier, while the classifiers' performance improves as the accuracy of the data labeling increases. | zh_TW |
dc.description.abstract | In this thesis, we aim to better address several important computer vision problems when multiple feature representations of data are available. To achieve this goal, we propose novel multiple kernel learning (MKL) techniques to carry out feature combination and facilitate the underlying vision tasks. The thesis consists of three parts. The approaches proposed in the three parts are self-contained but closely related. The common theme among them is that kernel matrices serve as the unified feature representation for data under different descriptors. Based upon this setting, feature fusion can be carried out in the domain of kernel matrices, while the applications gain a direct connection to kernel machines, among the best off-the-shelf machine learning methodologies.

In the first part of the thesis, we aim to provide a unified and compact view of data with multiple feature representations. The motivation is that the feature representations of data under various descriptors are typically high dimensional and assume diverse forms. Transforming them into a Euclidean space of lower dimension hence generally facilitates the underlying vision tasks, such as recognition or clustering. To this end, the proposed approach (termed MKL-DR) generalizes the framework of multiple kernel learning for dimensionality reduction, and distinguishes itself with the following three main contributions. First, our method provides the convenience of using diverse image descriptors to describe useful characteristics of various aspects of the underlying data. Second, it extends a broad set of existing dimensionality reduction techniques to consider multiple kernel learning, and consequently improves their effectiveness. Third, by focusing on the techniques pertaining to dimensionality reduction, the formulation introduces a new class of applications of the multiple kernel learning framework, addressing not only supervised learning problems but also unsupervised and semi-supervised ones.

In the second part, we propose local ensemble kernels for adaptive feature fusion. As data often display large interclass and intraclass variations in complex vision tasks, the optimal feature combination for recognition or clustering may vary from class to class, so learning one global feature fusion may not make the most of the multiple features. We hence suggest learning a set of local classifiers, each derived for one training sample and optimized to give good classification performance for data falling within the corresponding local area. Since as many local classifiers as training samples must be learned in this setting, it is important to keep the computational cost feasible even for datasets of large size. Specifically, we cast the multiple, independent training processes of the local classifiers as a correlated multi-task learning problem, and design a new boosting algorithm to accomplish these tasks simultaneously and with higher efficiency.

In the last part of this thesis, we illustrate how to integrate supervised MKL techniques for feature combination into unsupervised and semi-supervised clustering tasks. The intrinsic difficulty of this part results from the unsupervised nature of clustering: we have no labeled data to guide the search for the optimal ensemble kernel over a given convex set of base kernels. Our key idea for handling this difficulty is to cast supervised feature selection and the unsupervised clustering procedure as a joint optimization problem. Besides, we describe a clustering approach with an emphasis on detecting coherent structures in a complex dataset, and consider cluster-dependent feature selection. Specifically, we associate each cluster with a boosting classifier derived from multiple kernel learning, and apply the cluster-specific classifier to perform feature selection across the various descriptors to best separate data of the cluster from the rest. We integrate the multiple, correlated training tasks of the cluster-specific classifiers into the clustering procedure. By iteratively solving the joint optimization problem, the cluster structure is gradually revealed by these classifiers, while their discriminative power to capture similar data is progressively improved owing to better data labeling. | en |
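The common theme of the abstract — one kernel matrix per descriptor as the unified representation, with fusion performed as a convex combination of base kernels — can be sketched in a few lines. This is a minimal illustration with synthetic data and fixed weights; in MKL the weights are learned, and the feature names and dimensions below are assumptions, not taken from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 60 samples under two different feature views
# (say, a shape descriptor and a color descriptor) of unequal dimension.
X_shape = rng.normal(size=(60, 16))
X_color = rng.normal(size=(60, 8))

def rbf_kernel(X, gamma):
    """Gram matrix of the Gaussian (RBF) kernel for one feature view."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.clip(d2, 0.0, None))

# One base kernel per descriptor: once in kernel form, views of different
# native formats and dimensionalities become directly comparable.
base_kernels = [rbf_kernel(X_shape, gamma=0.05),
                rbf_kernel(X_color, gamma=0.10)]

# Ensemble kernel: non-negative weights summing to one. Fixed here for
# illustration; an MKL algorithm would optimize them from training data.
beta = np.array([0.7, 0.3])
K = sum(b * Km for b, Km in zip(beta, base_kernels))

# A convex combination of PSD kernels stays symmetric and PSD, so K can
# be fed directly to any kernel machine (e.g. an SVM).
assert np.allclose(K, K.T)
assert np.linalg.eigvalsh(K).min() > -1e-8
```

The closure property checked at the end is what lets the thesis's approaches search over the convex set of base-kernel combinations without leaving the space of valid kernels.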
dc.description.provenance | Made available in DSpace on 2021-06-08T05:01:31Z (GMT). No. of bitstreams: 1 ntu-99-D94922027-1.pdf: 3126225 bytes, checksum: 50b414fd3f2d04e868e8b23ef3f8b0e1 (MD5) Previous issue date: 2010 | en |
dc.description.tableofcontents | 1 Introduction
  1.1 Preliminary of Multiple Kernel Learning
  1.2 Thesis Overview
    1.2.1 Multiple Kernel Learning for Dimensionality Reduction
    1.2.2 Multiple Kernel Learning for Local Learning
    1.2.3 Multiple Kernel Learning for Unsupervised Clustering
  1.3 Thesis Organization
2 Multiple Kernel Learning for Dimensionality Reduction
  2.1 Literature Review
    2.1.1 Dimensionality Reduction
    2.1.2 Graph Embedding
    2.1.3 Multiple Kernel Learning
    2.1.4 Dimensionality Reduction with Multiple Kernels
  2.2 The MKL-DR Framework
    2.2.1 Kernel as a Unified Feature Representation
    2.2.2 The MKL-DR Algorithm
    2.2.3 Optimization
    2.2.4 Novel Sample Embedding
  2.3 Experimental Results I: Supervised Learning for Object Categorization
    2.3.1 Dataset
    2.3.2 Image Descriptors and Base Kernels
    2.3.3 Dimensionality Reduction Methods
    2.3.4 Quantitative Results
  2.4 Experimental Results II: Unsupervised Learning for Image Clustering
    2.4.1 Dataset
    2.4.2 Image Descriptors and Base Kernels
    2.4.3 Dimensionality Reduction Method
    2.4.4 Quantitative Results
  2.5 Experimental Results III: Semisupervised Learning for Face Recognition
    2.5.1 Dataset
    2.5.2 Image Descriptors and Base Kernels
    2.5.3 Dimensionality Reduction Method
    2.5.4 Quantitative Results
  2.6 Summary
3 Multiple Kernel Learning for Local Learning
  3.1 Motivation
  3.2 Literature Review
  3.3 Information Fusion via Kernel Alignment
    3.3.1 Kernel Alignment
    3.3.2 Feature Fusion via Kernel Alignment
  3.4 Local Ensemble Kernel Learning
    3.4.1 Initialization via Localized Kernel Alignment
    3.4.2 Local Ensemble Kernel Optimization
  3.5 Experimental Results
    3.5.1 Image Descriptors and Base Kernels
    3.5.2 Caltech-101 Dataset
    3.5.3 Corel + CUReT + Caltech-101 Dataset
    3.5.4 Complexity Analysis
  3.6 Summary
4 Localized Multiple Kernel Learning via Multi-task Boosting
  4.1 Motivation
  4.2 Literature Review
    4.2.1 Local Learning
    4.2.2 Multi-task Learning
  4.3 The Proposed Framework
    4.3.1 Notations
    4.3.2 Problem Definition
    4.3.3 Formulation
  4.4 Multi-task Boosting
    4.4.1 Design of Weak Learners
    4.4.2 The Boosting Algorithm
    4.4.3 Useful Properties
  4.5 Experimental Results
    4.5.1 Caltech-101
    4.5.2 Pascal VOC 2007
  4.6 Summary
5 Multiple Kernel Learning for Unsupervised Clustering
  5.1 Motivation
  5.2 Literature Review
    5.2.1 Clustering
    5.2.2 Clustering with Feature Selection
    5.2.3 Ensemble Clustering
  5.3 Problem Definition
    5.3.1 Notations
    5.3.2 Formulation
  5.4 Optimization Procedure
    5.4.1 On Learning Cluster-specific Classifiers
    5.4.2 On Assigning Data into Clusters
    5.4.3 Implementation Details
  5.5 Experimental Results
    5.5.1 Visual Object Categorization
    5.5.2 Face Image Grouping
  5.6 Summary
6 Conclusions and Future Work
  6.1 Conclusions
  6.2 Future Work
    6.2.1 New Learning Scenarios for Multiple Kernel Learning
    6.2.2 Multiple Kernel Learning for Large-scale Data Analysis
Bibliography
Publications | |
dc.language.iso | en | |
dc.title | 多核學習於電腦視覺之應用 | zh_TW |
dc.title | Multiple Kernel Learning for Computer Vision Applications | en |
dc.type | Thesis | |
dc.date.schoolyear | 99-1 | |
dc.description.degree | 博士 (Ph.D.) |
dc.contributor.coadvisor | 劉庭祿(Tyng-Luh Liu) | |
dc.contributor.oralexamcommittee | 洪一平(Yi-Ping Hung),莊永裕(Yung-Yu Chuang),賴尚宏(Shang-Hong Lai),陳煥宗(Hwann-Tzong Chen),莊仁輝(Jen-Hui Chuang),廖弘源(Hong-Yuan Liao),陳祝嵩(Chu-Song Chen) | |
dc.subject.keyword | 多核學習,特徵結合,核方法,降維分析,區域學習,多任務學習,提昇演算法,物件辨識,影像分群,材質分類,人臉識別 | zh_TW |
dc.subject.keyword | multiple kernel learning, feature fusion, kernel method, dimensionality reduction, local learning, multi-task learning, boosting algorithm, object categorization, image clustering, texture classification, face recognition | en |
dc.relation.page | 126 | |
dc.rights.note | 未授權 (not authorized) |
dc.date.accepted | 2010-11-22 | |
dc.contributor.author-college | 電機資訊學院 (College of Electrical Engineering and Computer Science) | zh_TW |
dc.contributor.author-dept | 資訊工程學研究所 (Graduate Institute of Computer Science and Information Engineering) | zh_TW |
Appears in Collections: | 資訊工程學系 (Department of Computer Science and Information Engineering)
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-99-1.pdf (currently not authorized for public access) | 3.05 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated in their copyright terms.