Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/48767
Title: | Optimization Methods for Primal and Dual Maximum Entropy Problems |
Authors: | Fang-Lan Huang 黃芳蘭 |
Advisor: | Chih-Jen Lin (林智仁) |
Keyword: | maximum entropy, logistic regression, iterative scaling, coordinate descent, natural language processing, optimization, linear classification |
Publication Year : | 2010 |
Degree: | Ph.D. |
Abstract: | Maximum entropy is useful in natural language processing and many other areas. In this thesis we aim to study efficient optimization methods for training maximum entropy models in both primal and dual forms. We also investigate a special case of maximum entropy called logistic regression. This thesis is divided into two parts.
In the first part, we study iterative scaling methods for solving the primal problem of maximum entropy. Iterative scaling methods are currently among the most popular approaches for training maximum entropy models. With many variants of iterative scaling methods, it is difficult to understand them and see their differences. We propose a general and unified framework for iterative scaling methods; this framework also connects iterative scaling and coordinate descent methods. We prove general convergence results and analyze the computational complexity. Based on the proposed framework, we extend a coordinate descent method for linear support vector machines (SVM) to the primal problem of maximum entropy. Experiments show that it is faster than existing iterative scaling methods.
In the second part, we study optimization methods for solving the dual problem of maximum entropy. Most existing training methods solve the primal problem instead of the dual. In contrast, for SVM, methods that solve the dual problem have been shown to be very effective. We apply coordinate descent methods to solve the dual forms of logistic regression and maximum entropy. Interestingly, many details differ from the SVM situation. We carefully study the theoretical convergence as well as numerical issues. The proposed dual methods are shown to be faster than most state-of-the-art methods for training logistic regression and maximum entropy. |
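To give a concrete sense of the coordinate descent idea mentioned in the abstract, here is a minimal illustrative sketch. It is not the thesis's algorithm (which targets the dual problems of logistic regression and maximum entropy); it applies generic one-variable Newton updates to the primal of L2-regularized logistic regression, and all function names and parameters below are chosen here purely for illustration.

```python
import numpy as np

def coordinate_descent_logreg(X, y, C=1.0, n_sweeps=100):
    """Minimize f(w) = 0.5*||w||^2 + C * sum_i log(1 + exp(-y_i * x_i.w))
    by cycling over coordinates, taking one Newton step per coordinate.
    X: (n, d) feature matrix; y: labels in {-1, +1}."""
    n, d = X.shape
    w = np.zeros(d)
    z = X @ w  # cached decision values X w, updated incrementally
    for _ in range(n_sweeps):
        for j in range(d):
            sigma = 1.0 / (1.0 + np.exp(y * z))  # = 1 - P(y_i | x_i)
            # partial derivative and curvature of f with respect to w_j
            g = w[j] - C * np.sum(sigma * y * X[:, j])
            h = 1.0 + C * np.sum(sigma * (1.0 - sigma) * X[:, j] ** 2)
            step = -g / h
            w[j] += step
            z += step * X[:, j]  # keep the cache consistent
    return w
```

Caching `z = X w` and updating it after each coordinate step is the key trick that makes each update cost O(n) rather than O(nd); the thesis's methods rely on analogous incremental maintenance of per-example quantities.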
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/48767 |
Fulltext Rights: | Authorized for a fee |
Appears in Collections: | Department of Computer Science and Information Engineering |
Files in This Item:
File | Size | Format
---|---|---
ntu-99-1.pdf (Restricted Access) | 1.74 MB | Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.