Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/45612
Title: | Regression approaches for multi-class cost-sensitive classification (用迴歸分析處理成本導向多重分類問題) |
Author: | Han-Hsing Tu (涂漢興) |
Advisor: | Hsuan-Tien Lin (林軒田) |
Keywords: | multi-class cost-sensitive classification, cost information, regression, support vector machines |
Year of publication: | 2009 |
Degree: | Master's |
Abstract: | Multi-class cost-sensitive classification has become increasingly important in recent years and has high practical value in problems such as pattern recognition and medical research. To make classification decisions that achieve the lowest cost, many researchers have proposed machine learning algorithms that incorporate cost information. The most common step in these algorithms is to convert the cost information into a weight for each example. In this thesis, we take a different step: we use regression to estimate the classification cost associated with each example, and make the classification decision according to the lowest estimated cost. This approach is simple, can be combined with a wide range of regression algorithms, and is backed by a strong theoretical foundation. We further analyze the blind spots of this approach and, using a novel regression loss function together with support vector machines, design a more powerful multi-class cost-sensitive classification algorithm, whose advantages we demonstrate through experiments.
Cost-sensitive classification has been an important research problem in recent years. It allows machine learning algorithms to use additional cost information to make more strategic decisions. Studies on binary cost-sensitive classification have led to promising results in theory, algorithms, and applications. The multi-class counterpart is also needed in many real-world applications, but is more difficult to analyze. This thesis focuses on multi-class cost-sensitive classification. Existing methods for multi-class cost-sensitive classification usually transform the cost information into example importance (weight). This thesis offers a different viewpoint on the problem and proposes a novel method. We directly estimate the cost value corresponding to each prediction using regression, and output the label with the smallest estimated cost. We improve the method by analyzing the errors made during the decision. Then, we propose a different regression loss function that connects tightly with those errors. The new loss function leads to a solid theoretical guarantee of error transformation. We design a concrete algorithm for the loss function using support vector machines. The algorithm can be viewed as a theoretically justified extension of the popular one-versus-all support vector machine. Experiments using real-world data sets with arbitrary cost values demonstrate the usefulness of our proposed methods, and validate that the cost information should be appropriately used instead of dropped. |
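The basic reduction described in the abstract (estimate a cost for every candidate label with regression, then output the label with the smallest estimate) can be sketched roughly as follows. This is only an illustrative sketch, not the thesis's algorithm: it swaps in plain k-nearest-neighbour regression as the regressor, whereas the thesis pairs a specially designed loss function with support vector machines. The function names `estimate_costs` and `predict`, the parameter `k`, and the toy data are all assumptions made for this example.

```python
from math import dist


def estimate_costs(train_X, train_C, x, k=3):
    """Estimate the cost of each label for x with k-nearest-neighbour regression.

    train_X: list of feature tuples.
    train_C: list of cost vectors; train_C[i][j] is the cost of predicting
             label j for training example i.
    """
    # Indices of the k training examples closest to x (Euclidean distance).
    nearest = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], x))[:k]
    num_labels = len(train_C[0])
    # Average the neighbours' cost vectors to get one estimate per label.
    return [sum(train_C[i][j] for i in nearest) / len(nearest)
            for j in range(num_labels)]


def predict(train_X, train_C, x, k=3):
    """Return the label whose estimated cost is smallest."""
    est = estimate_costs(train_X, train_C, x, k)
    return min(range(len(est)), key=est.__getitem__)


# Hypothetical toy data: label 0 is cheap near x = 0, label 1 is cheap near x = 1.
train_X = [(0.0,), (0.1,), (0.2,), (1.0,), (1.1,), (1.2,)]
train_C = [(0, 5), (0, 4), (1, 5), (6, 0), (5, 1), (4, 0)]

label_near_zero = predict(train_X, train_C, (0.05,))
label_near_one = predict(train_X, train_C, (1.15,))
```

Any regressor that maps an example to a per-label cost estimate could replace the nearest-neighbour step; only the final minimum-cost decision rule is taken from the abstract.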
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/45612 |
Full-text license: | Paid license |
Appears in collections: | Department of Computer Science and Information Engineering |
Files in this item:
File | Size | Format | |
---|---|---|---|
ntu-98-1.pdf (not currently authorized for public access) | 934.46 kB | Adobe PDF |
Unless their copyright terms are specifically indicated, all items in this system are protected by copyright, with all rights reserved.