Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/38091
Title: KPCA和KFD的屬性解釋以及分類法則之研究
Attribute Interpretation and Classification Rules behind Kernel Principal Component Analysis and Kernel Fisher Discriminants
Authors: Yi-Hung Chen
陳宜鴻
Advisor: 陳正剛
Keyword: Classification method, Kernel principal component analysis, Kernel Fisher discriminants
Publication Year : 2005
Degree: Master's
Abstract: Classification methods fall broadly into two categories: unsupervised classification and supervised classification. Principal Component Analysis (PCA) and Fisher Linear Discriminant (FLD) are linear multivariate analysis methods, where PCA is an unsupervised classification method and FLD is a supervised one. Because linear methods are not sufficient to analyze data with nonlinear patterns, the nonlinear methods Kernel Principal Component Analysis (KPCA) and Kernel Fisher Discriminants (KFD) were extended from PCA and FLD, respectively. Both transform the instances from the original attribute space to a feature space that can be arbitrarily large, possibly infinite-dimensional. The feature space is effective for feature extraction and pattern recognition, but the meaning of the original attributes is lost there. For attribute interpretation, we segment the instances with nonlinear patterns into several partitions, each of which has its own linear patterns; linear methods can then be applied within each partition for attribute interpretation. For KPCA, the segmentation is performed in the feature space through the second and higher KPC score(s), after which we return to the original attribute space for attribute interpretation. For KFD, we segment the training instances by the appropriate second and higher KFD score(s) to construct a linear predictive model for future prediction. This segmentation turns KPCA into segmented PCA by minimizing the reconstruction error, and turns KFD into segmented FLD by maximizing the classification accuracy. To verify these methods, simulated examples are first used to examine and compare the proposed methods against other conventional methods; real-world data sets are then used to validate the proposed methodologies.
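The segmentation described in the abstract is driven by kernel principal component scores. The following is a minimal, illustrative sketch of how such scores are computed for training instances, not the thesis's own implementation; the RBF kernel, the `gamma` value, and the two-ring toy data are assumptions chosen for illustration:

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    # Pairwise squared Euclidean distances -> Gaussian (RBF) kernel matrix
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * d2)

def kpca_scores(X, n_components=2, gamma=1.0):
    """Return the first n_components kernel principal component scores."""
    n = X.shape[0]
    K = rbf_kernel(X, gamma)
    # Center the kernel matrix in feature space:
    # K_c = (I - 1_n) K (I - 1_n), where 1_n is the n x n matrix of 1/n
    one_n = np.full((n, n), 1.0 / n)
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # Eigendecompose; numpy returns ascending eigenvalues, so reverse the order
    eigvals, eigvecs = np.linalg.eigh(Kc)
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
    # Scale eigenvectors so the feature-space principal axes have unit norm
    alphas = eigvecs[:, :n_components] / np.sqrt(
        np.maximum(eigvals[:n_components], 1e-12))
    # Score of instance i on component k is (Kc @ alphas)[i, k]
    return Kc @ alphas

# Toy data: two concentric rings, linearly inseparable in the attribute space
rng = np.random.default_rng(0)
t = rng.uniform(0.0, 2.0 * np.pi, 200)
r = np.concatenate([np.full(100, 1.0), np.full(100, 3.0)])
X = np.c_[r * np.cos(t), r * np.sin(t)] + rng.normal(scale=0.05, size=(200, 2))

scores = kpca_scores(X, n_components=2, gamma=0.5)
# The sign of a chosen KPC score can partition the instances into two groups,
# after which linear PCA can be applied within each partition separately.
print(scores.shape)  # (200, 2)
```

Once instances are split by the sign of a score, fitting ordinary PCA within each partition recovers attribute loadings that can be interpreted in the original space, which is the role the segmentation plays in the abstract.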
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/38091
Fulltext Rights: Paid authorization (有償授權)
Appears in Collections: Graduate Institute of Industrial Engineering (工業工程學研究所)

Files in This Item:
ntu-94-1.pdf (Restricted Access), 545.53 kB, Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
