Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/41305
Full metadata record
DC 欄位 | 值 | 語言 |
---|---|---|
dc.contributor.advisor | 林智仁(Chih-Jen Lin) | |
dc.contributor.author | Cho-Jui Hsieh | en |
dc.contributor.author | 謝卓叡 | zh_TW |
dc.date.accessioned | 2021-06-15T00:15:32Z | - |
dc.date.available | 2009-07-16 | |
dc.date.copyright | 2009-07-16 | |
dc.date.issued | 2009 | |
dc.date.submitted | 2009-06-16 | |
dc.identifier.citation | D. P. Bertsekas. Nonlinear Programming. Athena Scientific, Belmont, MA 02178-9998, second edition, 1999.
B. E. Boser, I. Guyon, and V. Vapnik. A training algorithm for optimal margin classifiers. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, pages 144-152. ACM Press, 1992.
L. Bottou. Stochastic gradient descent examples, 2007. http://leon.bottou.org/projects/sgd.
L. Bottou and O. Bousquet. The tradeoffs of large scale learning. In J. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems 20. MIT Press, Cambridge, MA, 2008.
K.-W. Chang, C.-J. Hsieh, and C.-J. Lin. Coordinate descent method for large-scale L2-loss linear SVM. Journal of Machine Learning Research, 9:1369-1398, 2008. URL http://www.csie.ntu.edu.tw/~cjlin/papers/cdl2.pdf.
M. Dudík, S. J. Phillips, and R. E. Schapire. Performance guarantees for regularized maximum entropy density estimation. In Proceedings of the 17th Annual Conference on Computational Learning Theory, pages 655-662, New York, 2004. ACM Press.
I. S. Duff, R. G. Grimes, and J. G. Lewis. Sparse matrix test problems. ACM Transactions on Mathematical Software, 15:1-14, 1989.
L. Grippo and M. Sciandrone. Globally convergent block-coordinate techniques for unconstrained optimization. Optimization Methods and Software, 10:587-637, 1999.
C.-J. Hsieh, K.-W. Chang, C.-J. Lin, S. S. Keerthi, and S. Sundararajan. A dual coordinate descent method for large-scale linear SVM. In Proceedings of the Twenty-Fifth International Conference on Machine Learning (ICML), 2008. URL http://www.csie.ntu.edu.tw/~cjlin/papers/cddual.pdf. Software available at http://www.csie.ntu.edu.tw/~cjlin/liblinear.
T. Joachims. Training linear SVMs in linear time. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD). ACM, 2006.
S. S. Keerthi and D. DeCoste. A modified finite Newton method for fast solution of large scale linear SVMs. Journal of Machine Learning Research, 6:341-361, 2005.
D. D. Lewis, Y. Yang, T. G. Rose, and F. Li. RCV1: A new benchmark collection for text categorization research. Journal of Machine Learning Research, 5:361-397, 2004.
C.-J. Lin, R. C. Weng, and S. S. Keerthi. Trust region Newton method for large-scale logistic regression. Journal of Machine Learning Research, 9:627-650, 2008. URL http://www.csie.ntu.edu.tw/~cjlin/papers/logistic.pdf.
Z.-Q. Luo and P. Tseng. On the convergence of coordinate descent method for convex differentiable minimization. Journal of Optimization Theory and Applications, 72(1):7-35, 1992.
O. L. Mangasarian. A finite Newton method for classification. Optimization Methods and Software, 17(5):913-929, 2002.
G. Rätsch, S. Mika, and M. K. Warmuth. On the convergence of leveraging. In T. G. Dietterich, S. Becker, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems 14, pages 487-494. MIT Press, Cambridge, MA, 2002.
N. N. Schraudolph. A fast, compact approximation of the exponential function. Neural Computation, 11:853-862, 1999.
S. Shalev-Shwartz, Y. Singer, and N. Srebro. Pegasos: primal estimated sub-gradient solver for SVM. In Proceedings of the 24th International Conference on Machine Learning (ICML), 2007.
A. J. Smola, S. V. N. Vishwanathan, and Q. Le. Bundle methods for machine learning. In J. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems 20. MIT Press, Cambridge, MA, 2008.
T. Zhang. Solving large scale linear prediction problems using stochastic gradient descent algorithms. In Proceedings of the 21st International Conference on Machine Learning (ICML), 2004.
T. Zhang and F. J. Oles. Text categorization based on regularized linear classification methods. Information Retrieval, 4(1):5-31, 2001. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/41305 | - |
dc.description.abstract | 線性支持向量機(SVM)是分類大規模資料時很有用的方法。在文件分類和自然語言處理的問題中,特徵向量常常是稀疏的。在這篇論文中,我們提出一個新的座標下降法來求解二次漏失函數的線性支持向量機。我們提出的方法在每一步過程中固定其他變數,只針對某個變數做最小化。而針對這個變數最小化的過程是用牛頓法配上線性搜尋的技巧。我們的演算法會以線性的速度收斂到函數的最小值。因為在最佳化每個變數時,我們的演算法必須找到擁有某個特徵值的所有資料,所以比較適合處理能方便取得這種資訊的訓練資料。實驗結果顯示出我們的方法比其他目前最新的方法例如Pegasos和TRON還快且穩定。 | zh_TW |
dc.description.abstract | Linear support vector machines (SVM) are useful for classifying large-scale sparse data. Problems with sparse features are common in applications such as document classification and natural language processing. In this thesis, we propose a novel coordinate descent algorithm for training linear SVM with the L2-loss function. At each step, the proposed method minimizes a one-variable sub-problem while fixing other variables. The sub-problem is solved by Newton steps with the line search technique. The procedure globally converges at a linear rate. As each sub-problem involves only values of a corresponding feature, the proposed approach is suitable when accessing a feature is more convenient than accessing an instance. Experiments show that our method is more efficient and stable than state-of-the-art methods such as Pegasos and TRON. | en |
dc.description.provenance | Made available in DSpace on 2021-06-15T00:15:32Z (GMT). No. of bitstreams: 1 ntu-98-R96922048-1.pdf: 2461290 bytes, checksum: dc14884762f6c37a980d4ac56ff5e798 (MD5) Previous issue date: 2009 | en |
dc.description.tableofcontents | TABLE OF CONTENTS
口試委員審定書 i
中文摘要 ii
ABSTRACT iii
LIST OF FIGURES vi
LIST OF TABLES vii
CHAPTER
I. Introduction 1
II. Solving Linear SVM via Coordinate Descent 6
III. Implementation Issues 12
3.1 Data Representation 12
3.2 Random Permutation of Sub-problems 13
3.3 An Online Algorithm 13
IV. Related Methods 15
4.1 Pegasos for L1-SVM 15
4.2 Trust Region Newton Method (TRON) for L2-SVM 18
4.3 CMLS: A Coordinate Descent Method for L2-SVM 20
V. Experiments and Analysis 22
5.1 Data Sets 22
5.2 Comparisons 23
5.3 Stopping Conditions 30
VI. Discussion and Conclusions 32
6.1 Order of Sub-problems at Each Outer Iteration 32
6.2 Coordinate Descents for Logistic Regression 33
6.3 Conclusions 35
APPENDICES 36
BIBLIOGRAPHY 46 | |
dc.language.iso | en | |
dc.title | 座標下降法求解大規模二次漏失函數線性支持向量機 | zh_TW |
dc.title | Coordinate Descent Method for Large-scale L2-loss Linear Support Vector Machines | en |
dc.type | Thesis | |
dc.date.schoolyear | 97-2 | |
dc.description.degree | 碩士 | |
dc.contributor.oralexamcommittee | 林軒田(Hsuan-Tien Lin),李育杰(Yuh-Jye Lee) | |
dc.subject.keyword | 線性支持向量機,文件分類,座標下降法 | zh_TW |
dc.subject.keyword | Linear support vector machine,Document classification,Coordinate descent | en |
dc.relation.page | 47 | |
dc.rights.note | 有償授權 | |
dc.date.accepted | 2009-06-16 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 資訊工程學研究所 | zh_TW |
Appears in Collections: | 資訊工程學系 (Department of Computer Science and Information Engineering) |
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-98-1.pdf (currently not authorized for public access) | 2.4 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
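The abstract describes the method only at a high level: cyclically minimize a one-variable sub-problem of the primal L2-loss SVM objective, using a Newton step with a line search. As an illustration only, here is a minimal dense NumPy sketch of that idea. All names and parameter choices (`cd_l2svm`, `sigma`, `beta`, the backtracking cap) are our own assumptions, not the thesis's implementation, which works feature-wise on sparse data and is far more efficient than this dense version.

```python
import numpy as np

def l2svm_obj(w, b, C):
    """Primal L2-loss SVM objective, given w and residuals b_i = 1 - y_i * w.x_i."""
    return 0.5 * (w @ w) + C * np.sum(np.maximum(b, 0.0) ** 2)

def cd_l2svm(X, y, C=1.0, n_outer=30, sigma=0.01, beta=0.5):
    """Coordinate descent for L2-loss linear SVM: one Newton step plus
    Armijo backtracking per coordinate, residuals maintained incrementally."""
    n, p = X.shape
    w = np.zeros(p)
    b = 1.0 - y * (X @ w)            # b_i = 1 - y_i * w.x_i
    rng = np.random.default_rng(0)
    for _ in range(n_outer):
        for j in rng.permutation(p):     # random order of sub-problems
            xj = X[:, j]
            act = b > 0                  # instances with nonzero loss
            # first and (generalized) second derivative of the 1-D sub-problem at 0
            g = w[j] - 2.0 * C * np.dot(y[act] * xj[act], b[act])
            if abs(g) < 1e-12:
                continue
            h = 1.0 + 2.0 * C * np.dot(xj[act], xj[act])   # always >= 1
            step = -g / h                # Newton direction
            f0 = l2svm_obj(w, b, C)
            for _ in range(30):          # Armijo backtracking line search
                b_try = b - step * (y * xj)
                w_try = w.copy()
                w_try[j] += step
                if l2svm_obj(w_try, b_try, C) <= f0 + sigma * step * g:
                    w, b = w_try, b_try  # sufficient decrease: accept
                    break
                step *= beta
    return w
```

The incremental residual update (`b_try = b - step * y * xj`) is what makes each sub-problem touch only one feature column, which is the property the abstract highlights for feature-wise data access.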