Coordinate Descent Method for Large-scale L2-loss Linear Support Vector Machines

Cho-Jui Hsieh; 謝卓叡

Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/41305

Title:	Coordinate Descent Method for Large-scale L2-loss Linear Support Vector Machines (座標下降法求解大規模二次漏失函數線性支持向量機)
Authors:	Cho-Jui Hsieh 謝卓叡
Advisor:	林智仁(Chih-Jen Lin)
Keyword:	Linear support vector machine, Document classification, Coordinate descent
Publication Year :	2009
Degree:	Master's
Abstract:	Linear support vector machines (SVM) are useful for classifying large-scale sparse data. Problems with sparse features are common in applications such as document classification and natural language processing. In this thesis, we propose a novel coordinate descent algorithm for training linear SVM with the L2-loss function. At each step, the proposed method minimizes a one-variable sub-problem while fixing the other variables. The sub-problem is solved by Newton steps with a line search technique. The procedure converges globally at a linear rate. As each sub-problem involves only the values of the corresponding feature, the proposed approach is suitable when accessing a feature is more convenient than accessing an instance. Experiments show that our method is more efficient and stable than state-of-the-art methods such as Pegasos and TRON.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/41305
Fulltext Rights:	Paid license
Appears in Collections:	Department of Computer Science and Information Engineering
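
The procedure outlined in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation (the full text is restricted): the function names, the default parameters, the Armijo-style sufficient-decrease test, and the dense list-of-lists input are all assumptions made for illustration. It assumes the standard primal objective 0.5*||w||^2 + C * sum_i max(0, 1 - y_i * w.x_i)^2, stores the data feature by feature, and updates one weight at a time with a single Newton step followed by a backtracking line search.

```python
def l2_svm_objective(w, X, y, C):
    """Primal L2-loss SVM objective: 0.5*||w||^2 + C*sum_i max(0, 1 - y_i*w.x_i)^2."""
    reg = 0.5 * sum(wj * wj for wj in w)
    loss = 0.0
    for xi, yi in zip(X, y):
        margin = 1.0 - yi * sum(wj * xij for wj, xij in zip(w, xi))
        if margin > 0.0:
            loss += margin * margin
    return reg + C * loss


def train_cd_l2svm(X, y, C=1.0, n_epochs=50, sigma=0.01, beta=0.5):
    """Illustrative coordinate descent for L2-loss linear SVM (hypothetical helper).

    X is a list of dense rows, y a list of +1/-1 labels. sigma and beta are
    assumed line-search constants, not values taken from the thesis.
    """
    n, d = len(X), len(X[0])
    # Feature-wise storage: for each feature j, the (instance index, value) pairs.
    cols = [[(i, X[i][j]) for i in range(n) if X[i][j] != 0.0] for j in range(d)]
    w = [0.0] * d
    b = [1.0] * n  # b[i] = 1 - y_i * w.x_i, which equals 1 at w = 0
    for _epoch in range(n_epochs):
        for j in range(d):
            col = cols[j]
            # First derivative and generalized second derivative of the
            # one-variable sub-problem at the current w[j].
            g = w[j] - 2.0 * C * sum(y[i] * v * b[i] for i, v in col if b[i] > 0.0)
            h = 1.0 + 2.0 * C * sum(v * v for i, v in col if b[i] > 0.0)
            if abs(g) < 1e-12:
                continue
            step = -g / h  # Newton direction
            old_loss = C * sum(b[i] * b[i] for i, _v in col if b[i] > 0.0)
            lam = 1.0
            for _try in range(30):
                z = lam * step
                new_loss = C * sum(max(0.0, b[i] - z * y[i] * v) ** 2 for i, v in col)
                delta = 0.5 * ((w[j] + z) ** 2 - w[j] ** 2) + new_loss - old_loss
                if delta <= sigma * g * z:  # Armijo-style sufficient decrease
                    break
                lam *= beta  # shrink the step and try again
            else:
                continue  # no acceptable step found; leave w[j] unchanged
            w[j] += z
            for i, v in col:
                b[i] -= z * y[i] * v  # keep b consistent with the new w
    return w
```

Each update touches only the instances in which feature j is non-zero, which is why feature-wise access to the data matters for this kind of method.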

Files in This Item:

File	Size	Format
ntu-98-1.pdf Restricted Access	2.4 MB	Adobe PDF