NTU Theses and Dissertations Repository
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/48980
Title: 以循環分類鍊處理多標籤分類問題
Cyclic Classifier Chain for Multilabel Classification
Authors: Yi-An Lin
林奕安
Advisor: 林軒田(Hsuan-Tien Lin)
Keyword: Machine Learning, Multilabel Classification, Cost-sensitive Learning
Publication Year: 2016
Degree: Master
Abstract: This thesis proposes the Cyclic Classifier Chain (CCC) to efficiently solve multilabel classification, a problem that arises frequently in the real world. The idea of CCC is to cyclically train multiple Classifier Chains. Doing so reduces the influence of label ordering: in the original Classifier Chain, each classifier can only use the previously trained labels as additional features during training, whereas CCC gives every classifier access to estimates of all labels. Our experiments show that after a few training cycles, CCC effectively reduces the influence of different label orderings on the results and outperforms Classifier Chain. Inspired by the Condensed Filter Tree algorithm, we also extend CCC to solve general cost-sensitive multilabel classification. Since different applications commonly measure cost in different ways, cost-sensitive multilabel classification is an important problem. Because each classifier can access the other labels during training, we can compute the cost of assigning each possible class to a label and embed this cost into the example weights.
This approach can minimize any example-based cost, and our experiments show that it outperforms another cost-sensitive method, the Probabilistic Classifier Chain, while being comparable to, and sometimes slightly better than, the Condensed Filter Tree. To make predictions more stable, we also propose a gradient-boosted version of CCC, which performs slightly better than plain CCC and can still be trained efficiently. We also experiment with regression trees as base learners, yielding a non-linear method for general cost-sensitive multilabel classification; the Condensed Filter Tree, by contrast, is hard to train with non-linear base learners because of its high training time complexity.
We propose a novel method, Cyclic Classifier Chain (CCC), for multilabel classification. CCC extends the classic Classifier Chain (CC) method by cyclically training multiple chains of labels. Three benefits immediately follow from the cyclic design. First, CCC resolves the critical issue of label ordering in CC, and therefore reaches more stable performance. Second, CCC matches the task of cost-sensitive multilabel classification, an important problem for satisfying application needs. The cyclic aspect of CCC allows estimating all labels during training, and such estimates make it possible to embed the cost information into the weights of examples. Experimental results justify that cost-sensitive CCC can be superior to state-of-the-art cost-sensitive multilabel classification methods. Third, CCC can be easily coupled with gradient boosting to inherit the advantages of ensemble learning. In particular, gradient-boosted CCC efficiently reaches promising performance for both linear and non-linear base learners. The three benefits, stability, cost-sensitivity, and efficiency, make CCC a competitive method for real-world applications.
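
To make the cyclic training idea in the abstract concrete, here is a minimal sketch assuming scikit-learn-style binary base learners. All names (train_ccc, predict_ccc, n_cycles, base_learner) are illustrative and not taken from the thesis; the sketch simply keeps one chain of classifiers per cycle and replays the cycles at prediction time.

import numpy as np
from sklearn.linear_model import LogisticRegression

def train_ccc(X, Y, n_cycles=3, base_learner=LogisticRegression):
    """Cyclically train one binary classifier per label; after the first
    cycle, every classifier sees estimates of *all* other labels."""
    n_labels = Y.shape[1]
    Y_hat = np.zeros_like(Y, dtype=float)  # label estimates, refined each cycle
    chains = []                            # one list of classifiers per cycle
    for _ in range(n_cycles):
        chain = []
        for k in range(n_labels):
            # Features = original inputs + current estimates of the other labels.
            X_aug = np.hstack([X, np.delete(Y_hat, k, axis=1)])
            clf = base_learner().fit(X_aug, Y[:, k])
            chain.append(clf)
            Y_hat[:, k] = clf.predict(X_aug)  # update the estimate of label k
        chains.append(chain)
    return chains

def predict_ccc(chains, X):
    """Replay the trained cycles in order, refining the label estimates."""
    n_labels = len(chains[0])
    Y_hat = np.zeros((X.shape[0], n_labels))
    for chain in chains:
        for k, clf in enumerate(chain):
            Y_hat[:, k] = clf.predict(np.hstack([X, np.delete(Y_hat, k, axis=1)]))
    return Y_hat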
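
The cost-embedding step can be sketched in the same spirit, continuing the code above: with the other labels fixed at their current estimates, the cost of setting label k to 0 or to 1 is computed for each example, and the cost difference becomes the example weight. The helper name fit_cost_sensitive_label and the cost function (a per-example Hamming loss) are illustrative assumptions, not the thesis implementation.

def fit_cost_sensitive_label(X, Y, Y_hat, k, cost_fn,
                             base_learner=LogisticRegression):
    """Train the classifier for label k with cost-derived example weights
    (simplified sketch; assumes cost_fn(y_true, y_pred) -> float)."""
    n = X.shape[0]
    targets = np.empty(n, dtype=int)
    weights = np.empty(n)
    for i in range(n):
        # Candidate label vectors: other labels fixed to current estimates,
        # label k set to 0 or to 1.
        cand0, cand1 = Y_hat[i].copy(), Y_hat[i].copy()
        cand0[k], cand1[k] = 0, 1
        c0, c1 = cost_fn(Y[i], cand0), cost_fn(Y[i], cand1)
        targets[i] = int(c1 < c0)      # train toward the cheaper choice
        weights[i] = abs(c0 - c1)      # how much the choice matters
    X_aug = np.hstack([X, np.delete(Y_hat, k, axis=1)])
    return base_learner().fit(X_aug, targets, sample_weight=weights)

def hamming_cost(y_true, y_pred):
    """Example per-example cost: fraction of mismatched labels."""
    return float(np.mean(y_true != y_pred))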
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/48980
DOI: 10.6342/NTU201603167
Fulltext Rights: Paid-access authorization
Appears in Collections: Department of Computer Science and Information Engineering

Files in This Item:
File: ntu-105-1.pdf (Restricted Access)
Size: 746.98 kB
Format: Adobe PDF