Effective Multi-class Kernel MSE Classifier with Sherman-Woodbury Formula

Chen-Wei Li; 李振維

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/32281

Title:	Effective Multi-class Kernel MSE Classifier with Sherman-Woodbury Formula (original title: 利用 Sherman-Woodbury 公式之高效率多類別非線性最小平方誤差分類器)
Author:	Chen-Wei Li (李振維)
Advisor:	陳正剛 (Argon Chen)
Keywords:	Classification method, FLD, KFD, MSE, KMSE, Efficiency
Publication Year:	2006
Degree:	Master's
Abstract:	Linear classification methods fall broadly into two families: the Minimum Squared Error (MSE) method and the Fisher Linear Discriminant (FLD). Because linear methods cannot adequately handle instances with nonlinear patterns, kernel-based nonlinear counterparts have been derived from each: Kernel Minimum Squared Error (KMSE) from MSE, and Kernel Fisher Discriminant (KFD) from FLD. Both map instances from the original attribute space into a higher-dimensional feature space and apply a linear method there. FLD and KFD seek a set of directions onto which the projections of the training instances (the discriminant scores) give maximal separability among the classes. However, FLD becomes inefficient when the data contain a large number of attributes, and KFD when they contain a large number of instances. To improve computing efficiency, we use MSE for linear classification problems. For multi-class problems, though, MSE, like the Support Vector Machine (SVM), must resort to the one-against-one or one-against-the-rest approach. Both are inefficient compared with FLD and KFD, which build a single model to discriminate all classes simultaneously.
We therefore develop a multi-class MSE method that uses the Sherman-Woodbury formula to improve computational efficiency. It handles multiple classes simultaneously through a class-labeling scheme, and the class-labeling schemes are determined by the Gram-Schmidt process. The nonlinear extension, multi-class KMSE, is developed from the multi-class MSE. A simulated example shows how the proposed method works and visualizes the meaning of the class-labeling scheme. Finally, two real-world datasets are used to compare the proposed method with conventional classification methods.
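The abstract does not give the thesis's formulation, so the following is only a minimal sketch of the general idea it names, under stated assumptions. It fits a ridge-regularized least-squares classifier for all classes in one solve, and uses the Woodbury identity to replace a d x d linear system (d attributes) with an n x n one (n instances). The regularization weight `lam`, the function names, and the one-hot target matrix `Y` are illustrative assumptions; in particular, the one-hot labels stand in for the Gram-Schmidt class-labeling scheme, which is not reproduced here.

```python
import numpy as np

def ridge_mse_direct(X, Y, lam):
    """Solve (lam*I_d + X^T X) W = X^T Y using a d x d system."""
    d = X.shape[1]
    return np.linalg.solve(lam * np.eye(d) + X.T @ X, X.T @ Y)

def ridge_mse_woodbury(X, Y, lam):
    """Same W, but only an n x n system is solved.

    Woodbury identity used here:
    (lam*I_d + X^T X)^{-1} = (1/lam) * [I_d - X^T (lam*I_n + X X^T)^{-1} X]
    """
    n = X.shape[0]
    XtY = X.T @ Y
    inner = np.linalg.solve(lam * np.eye(n) + X @ X.T, X @ XtY)
    return (XtY - X.T @ inner) / lam

# Hypothetical data: few instances, many attributes, three classes.
rng = np.random.default_rng(0)
n, d, k = 12, 200, 3
X = rng.normal(size=(n, d))
labels = np.arange(n) % k
Y = np.eye(k)[labels]  # assumed one-hot labeling, one column per class

W_direct = ridge_mse_direct(X, Y, lam=1.0)
W_woodbury = ridge_mse_woodbury(X, Y, lam=1.0)

# A single model scores every class at once; predict the largest score.
predicted = np.argmax(X @ W_woodbury, axis=1)
```

Both functions return the same weight matrix up to floating-point error; which one is cheaper depends on whether n or d is smaller, which matches the abstract's remark that the number of attributes and the number of instances drive the cost of the linear and kernel methods, respectively.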
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/32281
Full-text license:	Paid license
Appears in collections:	Graduate Institute of Industrial Engineering

Files in this item:

File	Size	Format
ntu-95-1.pdf (not currently authorized for public access)	1.5 MB	Adobe PDF

Unless their copyright terms are specifically indicated, all items in the system are protected by copyright, with all rights reserved.