Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/41305
Full metadata record
DC 欄位 | 值 | 語言 |
---|---|---|
dc.contributor.advisor | 林智仁(Chih-Jen Lin) | |
dc.contributor.author | Cho-Jui Hsieh | en |
dc.contributor.author | 謝卓叡 | zh_TW |
dc.date.accessioned | 2021-06-15T00:15:32Z | - |
dc.date.available | 2009-07-16 | |
dc.date.copyright | 2009-07-16 | |
dc.date.issued | 2009 | |
dc.date.submitted | 2009-06-16 | |
dc.identifier.citation | D. P. Bertsekas. Nonlinear Programming. Athena Scientific, Belmont, MA 02178-9998, second edition, 1999.
B. E. Boser, I. Guyon, and V. Vapnik. A training algorithm for optimal margin classifiers. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, pages 144-152. ACM Press, 1992.
L. Bottou. Stochastic gradient descent examples, 2007. http://leon.bottou.org/projects/sgd.
L. Bottou and O. Bousquet. The tradeoffs of large scale learning. In J. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems 20. MIT Press, Cambridge, MA, 2008.
K.-W. Chang, C.-J. Hsieh, and C.-J. Lin. Coordinate descent method for large-scale L2-loss linear SVM. Journal of Machine Learning Research, 9:1369-1398, 2008. URL http://www.csie.ntu.edu.tw/~cjlin/papers/cdl2.pdf.
M. Dudík, S. J. Phillips, and R. E. Schapire. Performance guarantees for regularized maximum entropy density estimation. In Proceedings of the 17th Annual Conference on Computational Learning Theory, pages 655-662, New York, 2004. ACM Press.
I. S. Duff, R. G. Grimes, and J. G. Lewis. Sparse matrix test problems. ACM Transactions on Mathematical Software, 15:1-14, 1989.
L. Grippo and M. Sciandrone. Globally convergent block-coordinate techniques for unconstrained optimization. Optimization Methods and Software, 10:587-637, 1999.
C.-J. Hsieh, K.-W. Chang, C.-J. Lin, S. S. Keerthi, and S. Sundararajan. A dual coordinate descent method for large-scale linear SVM. In Proceedings of the Twenty-Fifth International Conference on Machine Learning (ICML), 2008. URL http://www.csie.ntu.edu.tw/~cjlin/papers/cddual.pdf. Software available at http://www.csie.ntu.edu.tw/~cjlin/liblinear.
T. Joachims. Training linear SVMs in linear time. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD). ACM, 2006.
S. S. Keerthi and D. DeCoste. A modified finite Newton method for fast solution of large scale linear SVMs. Journal of Machine Learning Research, 6:341-361, 2005.
D. D. Lewis, Y. Yang, T. G. Rose, and F. Li. RCV1: A new benchmark collection for text categorization research. Journal of Machine Learning Research, 5:361-397, 2004.
C.-J. Lin, R. C. Weng, and S. S. Keerthi. Trust region Newton method for large-scale logistic regression. Journal of Machine Learning Research, 9:627-650, 2008. URL http://www.csie.ntu.edu.tw/~cjlin/papers/logistic.pdf.
Z.-Q. Luo and P. Tseng. On the convergence of coordinate descent method for convex differentiable minimization. Journal of Optimization Theory and Applications, 72(1):7-35, 1992.
O. L. Mangasarian. A finite Newton method for classification. Optimization Methods and Software, 17(5):913-929, 2002.
G. Rätsch, S. Mika, and M. K. Warmuth. On the convergence of leveraging. In T. G. Dietterich, S. Becker, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems 14, pages 487-494. MIT Press, Cambridge, MA, 2002.
N. N. Schraudolph. A fast, compact approximation of the exponential function. Neural Computation, 11:853-862, 1999.
S. Shalev-Shwartz, Y. Singer, and N. Srebro. Pegasos: primal estimated sub-gradient solver for SVM. In Proceedings of the 24th International Conference on Machine Learning (ICML), 2007.
A. J. Smola, S. V. N. Vishwanathan, and Q. Le. Bundle methods for machine learning. In J. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems 20. MIT Press, Cambridge, MA, 2008.
T. Zhang. Solving large scale linear prediction problems using stochastic gradient descent algorithms. In Proceedings of the 21st International Conference on Machine Learning (ICML), 2004.
T. Zhang and F. J. Oles. Text categorization based on regularized linear classification methods. Information Retrieval, 4(1):5-31, 2001. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/41305 | - |
dc.description.abstract | 線性支持向量機(SVM)是分類大規模資料時很有用的方法。在文件分類和自然語言處理的問題中,特徵向量常常是稀疏的。在這篇論文中,我們提出一個新的座標下降法來求解二次漏失函數的線性支持向量機。我們提出的方法在每一步過程中固定其他變數,只針對某個變數做最小化。而針對這個變數最小化的過程是用牛頓法配上線性搜尋的技巧。我們的演算法會以線性的速度收斂到函數的最小值。因為在最佳化每個變數時,我們的演算法必須找到擁有某個特徵值的所有資料,所以比較適合處理能方便取得這種資訊的訓練資料。實驗結果顯示出我們的方法比其他目前最新的方法例如Pegasos和TRON還快且穩定。 | zh_TW |
dc.description.abstract | Linear support vector machines (SVM) are useful for classifying large-scale sparse data. Problems with sparse features are common in applications such as document classification and natural language processing. In this thesis, we propose a novel coordinate descent algorithm for training linear SVM with the L2-loss function. At each step, the proposed method minimizes a one-variable sub-problem while fixing other variables. The sub-problem is solved by Newton steps with the line search technique. The procedure globally converges at a linear rate. As each sub-problem involves only values of a corresponding feature, the proposed approach is suitable when accessing a feature is more convenient than accessing an instance. Experiments show that our method is more efficient and stable than state-of-the-art methods such as Pegasos and TRON. | en |
dc.description.provenance | Made available in DSpace on 2021-06-15T00:15:32Z (GMT). No. of bitstreams: 1 ntu-98-R96922048-1.pdf: 2461290 bytes, checksum: dc14884762f6c37a980d4ac56ff5e798 (MD5) Previous issue date: 2009 | en |
dc.description.tableofcontents | TABLE OF CONTENTS
口試委員審定書 i
中文摘要 ii
ABSTRACT iii
LIST OF FIGURES vi
LIST OF TABLES vii
CHAPTER
I. Introduction 1
II. Solving Linear SVM via Coordinate Descent 6
III. Implementation Issues 12
3.1 Data Representation 12
3.2 Random Permutation of Sub-problems 13
3.3 An Online Algorithm 13
IV. Related Methods 15
4.1 Pegasos for L1-SVM 15
4.2 Trust Region Newton Method (TRON) for L2-SVM 18
4.3 CMLS: A Coordinate Descent Method for L2-SVM 20
V. Experiments and Analysis 22
5.1 Data Sets 22
5.2 Comparisons 23
5.3 Stopping Conditions 30
VI. Discussion and Conclusions 32
6.1 Order of Sub-problems at Each Outer Iteration 32
6.2 Coordinate Descents for Logistic Regression 33
6.3 Conclusions 35
APPENDICES 36
BIBLIOGRAPHY 46 | |
dc.language.iso | en | |
dc.title | 座標下降法求解大規模二次漏失函數線性支持向量機 | zh_TW |
dc.title | Coordinate Descent Method for Large-scale L2-loss Linear Support Vector Machines | en |
dc.type | Thesis | |
dc.date.schoolyear | 97-2 | |
dc.description.degree | 碩士 | |
dc.contributor.oralexamcommittee | 林軒田(Hsuan-Tien Lin),李育杰(Yuh-Jye Lee) | |
dc.subject.keyword | 線性支持向量機,文件分類,座標下降法 | zh_TW |
dc.subject.keyword | Linear support vector machine,Document classification,Coordinate descent | en |
dc.relation.page | 47 | |
dc.rights.note | 有償授權 | |
dc.date.accepted | 2009-06-16 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 資訊工程學研究所 | zh_TW |
Appears in Collections: | 資訊工程學系 (Department of Computer Science and Information Engineering) |
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-98-1.pdf (currently not authorized for public access) | 2.4 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
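The abstract describes the method only at a high level: cyclically minimize a one-variable sub-problem of the primal L2-loss SVM objective, using a Newton step with a line search. As an illustration only, here is a minimal dense NumPy sketch of that idea. All names and parameter choices (`cd_l2svm`, `sigma`, `beta`, the backtracking cap) are our own assumptions, not the thesis's implementation, which works feature-wise on sparse data and is far more efficient than this dense version.

```python
import numpy as np

def l2svm_obj(w, b, C):
    """Primal L2-loss SVM objective, given w and residuals b_i = 1 - y_i * w.x_i."""
    return 0.5 * (w @ w) + C * np.sum(np.maximum(b, 0.0) ** 2)

def cd_l2svm(X, y, C=1.0, n_outer=30, sigma=0.01, beta=0.5):
    """Coordinate descent for L2-loss linear SVM: one Newton step plus
    Armijo backtracking per coordinate, residuals maintained incrementally."""
    n, p = X.shape
    w = np.zeros(p)
    b = 1.0 - y * (X @ w)            # b_i = 1 - y_i * w.x_i
    rng = np.random.default_rng(0)
    for _ in range(n_outer):
        for j in rng.permutation(p):     # random order of sub-problems
            xj = X[:, j]
            act = b > 0                  # instances with nonzero loss
            # first and (generalized) second derivative of the 1-D sub-problem at 0
            g = w[j] - 2.0 * C * np.dot(y[act] * xj[act], b[act])
            if abs(g) < 1e-12:
                continue
            h = 1.0 + 2.0 * C * np.dot(xj[act], xj[act])   # always >= 1
            step = -g / h                # Newton direction
            f0 = l2svm_obj(w, b, C)
            for _ in range(30):          # Armijo backtracking line search
                b_try = b - step * (y * xj)
                w_try = w.copy()
                w_try[j] += step
                if l2svm_obj(w_try, b_try, C) <= f0 + sigma * step * g:
                    w, b = w_try, b_try  # sufficient decrease: accept
                    break
                step *= beta
    return w
```

The incremental residual update (`b_try = b - step * y * xj`) is what makes each sub-problem touch only one feature column, which is the property the abstract highlights for feature-wise data access.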