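Attribute Interpretation and Classification Rules behind Kernel Principal Component Analysis and Kernel Fisher Discriminants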

Yi-Hung Chen; 陳宜鴻

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/38091

Title:	Attribute Interpretation and Classification Rules behind Kernel Principal Component Analysis and Kernel Fisher Discriminants (KPCA和KFD的屬性解釋以及分類法則之研究)
Author:	Yi-Hung Chen (陳宜鴻)
Advisor:	陳正剛
Keywords:	Classification method, Kernel principal component analysis, Kernel Fisher discriminants
Year of publication:	2005
Degree:	Master's
Abstract:	Classification methods fall broadly into two categories: unsupervised and supervised. Principal component analysis (PCA) and the Fisher linear discriminant (FLD) are linear multivariate analysis methods; PCA is unsupervised and FLD is supervised. Because linear methods cannot adequately analyze data with nonlinear patterns, kernel principal component analysis (KPCA) and kernel Fisher discriminants (KFD) were developed as kernel-function-based extensions of PCA and FLD, respectively. Both map the instances from the original attribute space into a feature space of higher, possibly infinite, dimension. Feature extraction and pattern recognition can be carried out effectively in the feature space, but the meanings of the original attributes are lost there. To recover attribute interpretation, we segment the instances with nonlinear patterns into several partitions, each with its own linear pattern, and then apply linear methods within each partition.
For KPCA, the segmentation is performed in the feature space using the second and higher KPC score(s); the analysis then returns to the original attribute space for attribute interpretation. For KFD, we segment the training instances by appropriate second and higher KFD score(s) and construct a linear predictive model for future prediction. This segmentation turns KPCA into segmented PCA by minimizing the reconstruction error, which also highlights the more important attributes, and turns KFD into segmented FLD by maximizing the classification accuracy. To verify the approach, simulated examples are first used to compare the proposed methods against conventional methods; real-world data sets are then used to validate them. The examples show that the two methods provide attribute interpretation for nonlinear unsupervised and supervised classification, respectively, and are more efficient than linear classification methods.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/38091
Full-text license:	Paid license
Appears in collections:	Institute of Industrial Engineering
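
The segmentation idea summarized in the abstract can be illustrated with a minimal sketch. It is not the thesis's actual procedure: the RBF kernel, the value of `gamma`, the sign threshold on the second KPC score, the toy V-shaped data, and all function names are assumptions made here for illustration only. The sketch computes KPC scores, splits the instances into two partitions, and compares the reconstruction error of one global PCA with that of a separate PCA per partition.

```python
import numpy as np

def rbf_kernel(X, gamma):
    # Pairwise squared Euclidean distances, then the Gaussian (RBF) kernel.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kpca_scores(X, gamma, n_components):
    # Kernel PCA scores of the training instances (assumed RBF kernel).
    n = X.shape[0]
    K = rbf_kernel(X, gamma)
    J = np.eye(n) - np.ones((n, n)) / n      # centring matrix
    Kc = J @ K @ J                           # kernel centred in feature space
    vals, vecs = np.linalg.eigh(Kc)          # eigenvalues in ascending order
    vals = vals[::-1][:n_components]         # keep the largest ones
    vecs = vecs[:, ::-1][:, :n_components]
    # Score of instance i on component k is sqrt(lambda_k) * v_ik.
    return vecs * np.sqrt(np.maximum(vals, 0.0))

def pca_reconstruction_error(X, n_components=1):
    # Sum of squared residuals after projecting onto the leading PCs.
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)
    return float(np.sum(s[n_components:] ** 2))

def segmented_pca_error(X, labels, n_components=1):
    # Fit a separate linear PCA in each partition and add up the errors.
    return sum(pca_reconstruction_error(X[labels == g], n_components)
               for g in np.unique(labels))

# Toy data with a piecewise-linear (V-shaped) pattern; purely illustrative.
rng = np.random.default_rng(0)
t = rng.uniform(-1.0, 1.0, size=200)
X = np.column_stack([t, np.abs(t) + 0.02 * rng.normal(size=200)])

scores = kpca_scores(X, gamma=1.0, n_components=2)
labels = (scores[:, 1] > 0).astype(int)      # split on the second KPC score

global_err = pca_reconstruction_error(X)
seg_err = segmented_pca_error(X, labels)
print("global PCA error:", global_err)
print("segmented PCA error:", seg_err)
```

Because each partition gets its own mean and principal direction, the segmented error can never exceed the global one for the same number of components; how much it drops depends on how well the chosen score separates the linear pieces.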

Files in this item:

File	Size	Format
ntu-94-1.pdf (not currently authorized for public access)	545.53 kB	Adobe PDF

Items in this system are protected by copyright, with all rights reserved, unless their copyright terms state otherwise.
