Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/71243
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 林智仁(Chih-Jen Lin) | |
dc.contributor.author | Chi-Cheng Chiu | en |
dc.contributor.author | 丘濟政 | zh_TW |
dc.date.accessioned | 2021-06-17T05:00:35Z | - |
dc.date.available | 2021-08-01 | |
dc.date.copyright | 2018-08-01 | |
dc.date.issued | 2018 | |
dc.date.submitted | 2018-07-25 | |
dc.identifier.citation |
B. E. Boser, I. Guyon, and V. Vapnik. A training algorithm for optimal margin classifiers. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, pages 144–152. ACM Press, 1992.
C.-C. Chang and C.-J. Lin. LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27, 2011. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
W.-L. Chiang, M.-C. Lee, and C.-J. Lin. Parallel dual coordinate descent method for large-scale linear classification in multi-core environments. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2016. URL http://www.csie.ntu.edu.tw/~cjlin/papers/multicore_cddual.pdf.
C.-C. Chiu and C.-J. Lin. Two-variable block dual coordinate descent methods for large-scale linear support vector machines. Technical report, National Taiwan University, 2018. URL https://www.csie.ntu.edu.tw/~cjlin/papers/2var_cd/twocddual.pdf.
C. Cortes and V. Vapnik. Support-vector networks. Machine Learning, 20:273–297, 1995.
D. Csiba, Z. Qu, and P. Richtárik. Stochastic dual coordinate ascent with adaptive probabilities. In Proceedings of the 32nd International Conference on Machine Learning, pages 674–683, 2015.
R.-E. Fan, P.-H. Chen, and C.-J. Lin. Working set selection using second order information for training SVM. Journal of Machine Learning Research, 6:1889–1918, 2005. URL http://www.csie.ntu.edu.tw/~cjlin/papers/quadworkset.pdf.
R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin. LIBLINEAR: a library for large linear classification. Journal of Machine Learning Research, 9:1871–1874, 2008. URL http://www.csie.ntu.edu.tw/~cjlin/papers/liblinear.pdf.
T.-T. Friess, N. Cristianini, and C. Campbell. The kernel adatron algorithm: a fast and simple learning procedure for support vector machines. In Proceedings of the 15th International Conference on Machine Learning. Morgan Kaufmann Publishers, 1998.
T. Glasmachers and U. Dogan. Accelerated coordinate descent with adaptive coordinate frequencies. In Proceedings of the 5th Asian Conference on Machine Learning, volume 29 of Proceedings of Machine Learning Research, pages 72–86, 2013.
C. Hildreth. A quadratic programming procedure. Naval Research Logistics Quarterly, 4:79–85, 1957.
C.-J. Hsieh, K.-W. Chang, C.-J. Lin, S. S. Keerthi, and S. Sundararajan. A dual coordinate descent method for large-scale linear SVM. In Proceedings of the Twenty Fifth International Conference on Machine Learning (ICML), 2008. URL http://www.csie.ntu.edu.tw/~cjlin/papers/cddual.pdf.
T. Joachims. Making large-scale SVM learning practical. In B. Schölkopf, C. J. C. Burges, and A. J. Smola, editors, Advances in Kernel Methods – Support Vector Learning, pages 169–184, Cambridge, MA, 1998. MIT Press.
O. L. Mangasarian and D. R. Musicant. Successive overrelaxation for support vector machines. IEEE Transactions on Neural Networks, 10(5):1032–1037, 1999.
J. C. Platt. Fast training of support vector machines using sequential minimal optimization. In B. Schölkopf, C. J. C. Burges, and A. J. Smola, editors, Advances in Kernel Methods – Support Vector Learning, Cambridge, MA, 1998. MIT Press.
B. Schölkopf, J. C. Platt, J. Shawe-Taylor, A. J. Smola, and R. C. Williamson. Estimating the support of a high-dimensional distribution. Neural Computation, 13(7):1443–1471, 2001.
S. Shalev-Shwartz and T. Zhang. Stochastic dual coordinate ascent methods for regularized loss minimization. Journal of Machine Learning Research, 14:567–599, 2013.
I. Steinwart, D. Hush, and C. Scovel. Training SVMs without offset. Journal of Machine Learning Research, 12:141–202, 2011.
D. M. J. Tax and R. P. W. Duin. Support vector data description. Machine Learning, 54(1):45–66, 2004.
P. Tseng and S. Yun. A coordinate gradient descent method for nonsmooth separable minimization. Mathematical Programming, 117:387–423, 2009. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/71243 | - |
dc.description.abstract | 座標下降法已經是訓練大規模線性支持向量機的最新技術,最常用的設定是解無偏差項支持向量機的對偶問題,在此設定下一次更新一個變量的座標下降法可以輕易地實作出來。在這篇論文中,我們將一個變量的設定拓展至每步更新兩個變量。此拓展看似簡單,實則不然,必須經過一些複雜的推導才能得到單純的步驟。我們的演算法在一般情況下可與單變量座標下降法匹敵,且在困難問題中有更好的表現。我們進一步討論以雙變量座標下降法解有偏差項的標準支持向量機,分析顯示座標下降法在此形式上表現較差,與核方法支持向量機的情況非常不同。因此,過去十年單變量座標下降法的成功並非偶然:像是選擇非標準的無偏差項支持向量機形式等設計,使得計算上更有效率。綜觀所述,本論文為以座標下降法訓練線性支持向量機提供許多新的觀點。 | zh_TW |
dc.description.abstract | Coordinate descent (CD) methods have been a state-of-the-art technique for training large-scale linear SVMs. The most common setting is to solve the dual problem of an SVM formulation without the bias term, for which the CD procedure of updating one variable at a time is simple and easy to implement. In this thesis, we extend the one-variable setting to use two variables at each CD step. The extension, while it looks simple, is not trivial: some complicated derivations are needed to obtain a simple CD procedure. The resulting algorithm is generally competitive with one-variable CD and is superior on difficult problems. We further discuss two-variable CD for the standard SVM formulation with a bias term. The analysis shows that CD methods are less effective for this formulation, a situation very different from that of kernel SVM. Thus the success of simple one-variable CD over the past decade is not a coincidence: design choices such as the SVM formulation considered help to make it computationally efficient. Overall, this thesis offers many new insights into CD methods for training linear SVM. | en |
dc.description.provenance | Made available in DSpace on 2021-06-17T05:00:35Z (GMT). No. of bitstreams: 1 ntu-107-R03922089-1.pdf: 5351145 bytes, checksum: 826e2ad46b8f6e1d4b58553214ae6abc (MD5) Previous issue date: 2018 | en |
dc.description.tableofcontents |
口試委員會審定書  i
中文摘要  ii
ABSTRACT  iii
LIST OF FIGURES  vi
LIST OF TABLES  vii
CHAPTER
I. Introduction  1
II. Coordinate Descent Method for SVM Dual Problems  4
  2.1 Dual SVM Optimization Problem  4
  2.2 Block Coordinate Descent Methods for Linear SVM  6
  2.3 Gradient Calculation for Linear and Kernel SVM  8
  2.4 Working Set Selection for Block CD Methods  9
III. Two-variable Block Coordinate Descent Method  12
  3.1 The Framework of the Algorithm  12
  3.2 Solving Two-variable Sub-problems  13
    3.2.1 Hessian is Positive Definite  14
    3.2.2 Hessian is Positive Semi-definite  21
  3.3 Shrinking Technique  25
  3.4 Experiments  26
    3.4.1 The Importance of Randomness in Working-set Selection  28
    3.4.2 Comparison with One-variable CD  29
    3.4.3 Effect of Shrinking Techniques  30
IV. Two-variable Block CD for Linear SVM with a Bias Term  34
  4.1 Solving the Two-variable Sub-problem  34
  4.2 Difference Between With and Without the Bias Term  35
  4.3 Experiments  38
V. Discussion and Conclusions  41
APPENDICES  43
BIBLIOGRAPHY  54 | |
dc.language.iso | en | |
dc.title | 基於兩變量區塊座標下降法解大規模線性支持向量機 | zh_TW |
dc.title | Two-variable Block Dual Coordinate Descent Methods for Large-scale Linear Support Vector Machines | en |
dc.type | Thesis | |
dc.date.schoolyear | 106-2 | |
dc.description.degree | 碩士 (Master) | |
dc.contributor.oralexamcommittee | 林軒田 (Hsuan-Tien Lin), 李育杰 (Yuh-Jye Lee) | |
dc.subject.keyword | 區塊座標下降法, 線性支持向量機 | zh_TW |
dc.subject.keyword | block coordinate descent, linear support vector machines | en |
dc.relation.page | 55 | |
dc.identifier.doi | 10.6342/NTU201801837 | |
dc.rights.note | 有償授權 (paid authorization) | |
dc.date.accepted | 2018-07-26 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 資訊工程學研究所 | zh_TW |
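
The abstract above refers to the classic setting in which one-variable coordinate descent solves the dual of a linear SVM without a bias term (Hsieh et al., 2008, listed in the citation field). As a point of reference, here is a minimal sketch of that baseline update; it is an illustration under common assumptions (L1 hinge loss, labels in {-1, +1}), not code from the thesis, and the function name `dual_cd_svm` is invented for this example.

```python
# Minimal sketch of one-variable dual coordinate descent for linear SVM
# without a bias term (the baseline setting the abstract describes).
import numpy as np

def dual_cd_svm(X, y, C=1.0, n_epochs=10, seed=0):
    """X: (n, d) data matrix, y: (n,) labels in {-1, +1}."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    alpha = np.zeros(n)
    w = np.zeros(d)                    # maintained as w = sum_i alpha_i*y_i*x_i
    Qii = (X * X).sum(axis=1)          # diagonal of Q: ||x_i||^2
    for _ in range(n_epochs):
        for i in rng.permutation(n):   # random order of coordinates
            if Qii[i] == 0.0:
                continue               # skip degenerate zero rows
            G = y[i] * w.dot(X[i]) - 1.0                      # partial gradient
            new_ai = min(max(alpha[i] - G / Qii[i], 0.0), C)  # clip to [0, C]
            if new_ai != alpha[i]:
                w += (new_ai - alpha[i]) * y[i] * X[i]        # O(d) update of w
                alpha[i] = new_ai
    return w, alpha
```

Because w is maintained incrementally, each coordinate update costs only O(d); this is the property that makes CD attractive for linear (as opposed to kernel) SVM, and the thesis extends exactly this loop to select and update two variables per step.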
Appears in Collections: | 資訊工程學系 (Department of Computer Science and Information Engineering)
Files in This Item:
File | Size | Format |
---|---|---|
ntu-107-1.pdf (currently not authorized for public access) | 5.23 MB | Adobe PDF |
Items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
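
Chapter III in the table of contents above concerns solving the two-variable sub-problem, with separate treatment of the positive definite and positive semi-definite Hessian cases. The thesis derives its own direct procedure; the sketch below instead solves the same convex two-variable box-constrained sub-problem by a generic interior-then-edges case analysis, so it is an illustrative stand-in rather than the thesis's algorithm. The helper name `solve_two_var_subproblem` and its interface are hypothetical, and only the positive definite case is handled.

```python
# Generic exact solver for a convex 2-variable box-constrained QP:
#   minimize 0.5*d@H@d + g@d  subject to  lo <= d <= hi,
# where H is 2x2 positive definite. If the unconstrained minimizer is
# infeasible, the optimum lies on one of the four box edges, so we
# minimize the 1-D quadratic along each edge and keep the best point.
import numpy as np

def solve_two_var_subproblem(H, g, lo, hi):
    f = lambda d: 0.5 * d @ H @ d + g @ d
    d = np.linalg.solve(H, -g)                 # unconstrained Newton step
    if np.all(d >= lo) and np.all(d <= hi):
        return d                               # interior solution
    best, best_val = None, np.inf
    for k in (0, 1):                           # coordinate held at a bound
        m = 1 - k                              # free coordinate
        for b in (lo[k], hi[k]):
            dm = -(g[m] + H[k, m] * b) / H[m, m]   # 1-D minimizer on the edge
            dm = min(max(dm, lo[m]), hi[m])        # clip to the edge's interval
            cand = np.empty(2)
            cand[k], cand[m] = b, dm
            val = f(cand)
            if val < best_val:
                best, best_val = cand, val
    return best
```

For a working pair {i, j} in the bias-free dual, the bounds would be lo = -[alpha_i, alpha_j] and hi = C - [alpha_i, alpha_j]; after obtaining d, the usual CD bookkeeping applies: alpha_i and alpha_j are shifted by d, and w is refreshed with d[0]*y_i*x_i + d[1]*y_j*x_j.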