NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/48980
Full metadata record

DC Field: Value [Language]
dc.contributor.advisor: 林軒田 (Hsuan-Tien Lin)
dc.contributor.author: Yi-An Lin [en]
dc.contributor.author: 林奕安 [zh_TW]
dc.date.accessioned: 2021-06-15T11:12:54Z
dc.date.available: 2016-08-25
dc.date.copyright: 2016-08-25
dc.date.issued: 2016
dc.date.submitted: 2016-08-21
dc.identifier.citation:
[1] Robert E. Schapire and Yoram Singer. BoosTexter: A boosting-based system for text categorization. Machine Learning, 39(2/3):135–168, 2000.
[2] Naonori Ueda and Kazumi Saito. Parametric mixture models for multi-labeled text. In Advances in Neural Information Processing Systems 15 [Neural Information Processing Systems, NIPS 2002, December 9-14, 2002, Vancouver, British Columbia, Canada], pages 721–728, 2002.
[3] Forrest Briggs, Yonghong Huang, Raviv Raich, Konstantinos Eftaxias, Zhong Lei, William Cukierski, Sarah Frey Hadley, Adam Hadley, Matthew Betts, Xiaoli Z. Fern, Jed Irvine, Lawrence Neal, Anil Thomas, Gábor Fodor, Grigorios Tsoumakas, Hong Wei Ng, Thi Ngoc Tho Nguyen, Heikki Huttunen, Pekka Ruusuvuori, Tapio Manninen, Aleksandr Diment, Tuomas Virtanen, Julien Marzat, Joseph Defretin, Dave Callender, Chris Hurlburt, Ken Larrey, and Maxim Milakov. The 9th annual MLSP competition: New methods for acoustic classification of multiple simultaneous bird species in a noisy environment. In IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2013, Southampton, United Kingdom, September 22-25, 2013, pages 1–8, 2013.
[4] Matthew R. Boutell, Jiebo Luo, Xipeng Shen, and Christopher M. Brown. Learning multi-label scene classification. Pattern Recognition, 37(9):1757–1771, 2004.
[5] André Elisseeff and Jason Weston. A kernel method for multi-labelled classification. In Advances in Neural Information Processing Systems 14 [Neural Information Processing Systems: Natural and Synthetic, NIPS 2001, December 3-8, 2001, Vancouver, British Columbia, Canada], pages 681–687, 2001.
[6] Zafer Barutçuoglu, Robert E. Schapire, and Olga G. Troyanskaya. Hierarchical multi-label prediction of gene function. Bioinformatics, 22(7):830–836, 2006.
[7] Min-Ling Zhang and Zhi-Hua Zhou. ML-KNN: A lazy learning approach to multilabel learning. Pattern Recognition, 40(7):2038–2048, 2007.
[8] Yao-Nan Chen and Hsuan-Tien Lin. Feature-aware label space dimension reduction for multi-label classification. In Advances in Neural Information Processing Systems: Proceedings of the 2012 Conference (NIPS), pages 1538–1546, December 2012.
[9] Min-Ling Zhang and Zhi-Hua Zhou. Multi-label neural networks with applications to functional genomics and text categorization. IEEE Transactions on Knowledge and Data Engineering, 18(10):1338–1351, 2006.
[10] Grigorios Tsoumakas, Ioannis Katakis, and Ioannis Vlahavas. Mining Multi-label Data. In Oded Maimon and Lior Rokach, editors, Data Mining and Knowledge Discovery Handbook, chapter 34, pages 667–685. Springer US, 2010.
[11] Grigorios Tsoumakas and Ioannis P. Vlahavas. Random k-labelsets: An ensemble method for multilabel classification. In Machine Learning: ECML 2007, 18th European Conference on Machine Learning, Warsaw, Poland, September 17-21, 2007, Proceedings, pages 406–417, 2007.
[12] Jesse Read, Bernhard Pfahringer, Geoffrey Holmes, and Eibe Frank. Classifier chains for multi-label classification. In Machine Learning and Knowledge Discovery in Databases, European Conference, ECML PKDD 2009, Bled, Slovenia, September 7-11, 2009, Proceedings, Part II, pages 254–269, 2009.
[13] Krzysztof Dembczynski, Weiwei Cheng, and Eyke Hüllermeier. Bayes optimal multilabel classification via probabilistic classifier chains. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), June 21-24, 2010, Haifa, Israel, pages 279–286, 2010.
[14] Weiwei Liu and Ivor Tsang. On the optimality of classifier chain for multi-label classification. In C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, editors, Advances in Neural Information Processing Systems 28, pages 712–720. Curran Associates, Inc., 2015.
[15] Chun-Liang Li and Hsuan-Tien Lin. Condensed filter tree for cost-sensitive multilabel classification. In Proceedings of the International Conference on Machine Learning (ICML), pages 423–431, June 2014.
[16] Yashoteja Prabhu and Manik Varma. FastXML: A fast, accurate and stable tree-classifier for extreme multi-label learning. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), August 2014.
[17] Krzysztof Dembczynski, Wojciech Kotlowski, and Eyke Hüllermeier. Consistent multilabel ranking through univariate losses. In Proceedings of the 29th International Conference on Machine Learning, ICML 2012, Edinburgh, Scotland, UK, June 26 - July 1, 2012, 2012.
[18] Krzysztof Dembczynski, Willem Waegeman, Weiwei Cheng, and Eyke Hüllermeier. An exact algorithm for F-measure maximization. In Advances in Neural Information Processing Systems 24, pages 1404–1412, 2011.
[19] KDD Cup 2015 winner report. http://kddcup2015.com/informationwinners.html. Accessed: 2016-01-12.
[20] Loan default prediction winner report. https://www.kaggle.com/c/loandefaultprediction/details/winners. Accessed: 2016-01-12.
[21] Jerome H. Friedman. Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29(5):1189–1232, October 2001.
[22] Jerome H. Friedman. Stochastic gradient boosting. Computational Statistics & Data Analysis, 38(4):367–378, February 2002.
[23] L. Breiman, J. Friedman, R. Olshen, and C. Stone. Classification and Regression Trees. Wadsworth, Belmont, CA, 1984.
[24] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.
[25] G. Tsoumakas, E. Spyromitros-Xioufis, J. Vilcek, and I. Vlahavas. MULAN: A Java library for multi-label learning. Journal of Machine Learning Research, 12:2411–2414, 2011.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/48980
dc.description.abstract: This thesis proposes the Cyclic Classifier Chain (CCC) to efficiently solve multilabel classification, a problem common in real-world applications. The idea of CCC is to train multiple Classifier Chains cyclically. Doing so reduces the influence of the label order: in the original Classifier Chain, each classifier can only take the labels trained before it as additional features, whereas CCC gives every classifier access to estimates of all labels. Our experiments show that after a few training cycles, CCC effectively reduces the effect of different label orders on the results and outperforms the Classifier Chain. Inspired by the Condensed Filter Tree algorithm, we also extend CCC to solve general cost-sensitive multilabel classification. Because different applications evaluate cost in different ways, cost-sensitive multilabel classification is an important problem. Since each classifier can observe the other labels during training, we can compute the cost of classifying a label into each possible class and embed that cost into the example weights. This approach can minimize any example-based cost, and our experiments show that it outperforms another cost-sensitive method, the Probabilistic Classifier Chain, while being comparable with, or even slightly better than, the Condensed Filter Tree. To make predictions more stable, we further propose a gradient-boosted (Gradient Boosting) version of CCC, which performs slightly better than plain CCC and can still be trained efficiently. We also use regression trees as base learners, yielding a non-linear method for general cost-sensitive multilabel classification; the Condensed Filter Tree, by contrast, is hard to train with non-linear base learners because of its high training time complexity. [zh_TW]
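To make the cyclic training described above concrete, here is a minimal sketch in Python, assuming a feature matrix X, a binary label matrix Y of shape (n, K), and logistic-regression base learners. The function name train_ccc and the parameter n_cycles are illustrative assumptions, not from the thesis.

```python
# Minimal sketch of Cyclic Classifier Chain training (illustrative, not the
# thesis implementation). Assumes X: (n, d) features, Y: (n, K) binary labels.
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_ccc(X, Y, n_cycles=3):
    n, K = Y.shape
    Y_hat = np.zeros_like(Y)            # running estimates of all labels
    clfs = [None] * K
    for _ in range(n_cycles):
        for k in range(K):
            # Unlike a plain Classifier Chain, label k sees current estimates
            # of *all* other labels, not just those earlier in the order.
            others = np.delete(Y_hat, k, axis=1)
            Xk = np.hstack([X, others])
            clfs[k] = LogisticRegression().fit(Xk, Y[:, k])
            Y_hat[:, k] = clfs[k].predict(Xk)   # refresh estimate for label k
    return clfs
```

Prediction mirrors training: initialize the label estimates (e.g., to zeros), then cycle through the per-label classifiers a few times before reading off the final estimates.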
dc.description.abstract: We propose a novel method, Cyclic Classifier Chain (CCC), for multilabel classification. CCC extends the classic Classifier Chain (CC) method by cyclically training multiple chains of labels. Three benefits immediately follow from the cyclic design. First, CCC resolves the critical issue of label ordering in CC, and therefore reaches more stable performance. Second, CCC matches the task of cost-sensitive multilabel classification, an important problem for satisfying application needs. The cyclic aspect of CCC allows estimating all labels during training, and such estimates make it possible to embed the cost information into the weights of examples. Experimental results justify that cost-sensitive CCC can be superior to state-of-the-art cost-sensitive multilabel classification methods. Third, CCC can easily be coupled with gradient boosting to inherit the advantages of ensemble learning. In particular, gradient-boosted CCC efficiently reaches promising performance for both linear and non-linear base learners. The three benefits (stability, cost-sensitivity, and efficiency) make CCC a competitive method for real-world applications. [en]
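The cost-embedding step can be illustrated with a short, hedged sketch: for one label, compare the example-level cost of predicting 0 versus 1 while holding the current estimates of the other labels fixed; the cheaper bit becomes the training target and the cost gap becomes the example weight. Here cost_fn stands for any example-based cost function supplied by the application; all names are illustrative, not from the thesis.

```python
# Illustrative sketch of embedding per-example costs into sample weights for
# one label k. Y_true: (n, K) true labels, Y_hat: (n, K) current estimates.
import numpy as np

def cost_weighted_targets(Y_true, Y_hat, k, cost_fn):
    n = Y_true.shape[0]
    targets = np.empty(n, dtype=int)
    weights = np.empty(n, dtype=float)
    for i in range(n):
        y0, y1 = Y_hat[i].copy(), Y_hat[i].copy()
        y0[k], y1[k] = 0, 1
        c0 = cost_fn(Y_true[i], y0)     # cost if label k is predicted 0
        c1 = cost_fn(Y_true[i], y1)     # cost if label k is predicted 1
        targets[i] = int(c1 < c0)       # the cheaper prediction for label k
        weights[i] = abs(c0 - c1)       # how much the choice matters
    return targets, weights
```

The returned targets and weights can then be passed to any base learner that accepts per-example weights, e.g. clf.fit(Xk, targets, sample_weight=weights) in scikit-learn.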
dc.description.provenance: Made available in DSpace on 2021-06-15T11:12:54Z (GMT). No. of bitstreams: 1. ntu-105-R02922163-1.pdf: 764905 bytes, checksum: 5c7272c9fa46e48012c6560bace59aff (MD5). Previous issue date: 2016 [en]
dc.description.tableofcontents:
口試委員會審定書 (Oral Examination Committee Certification) i
中文摘要 (Chinese Abstract) iii
Abstract v
Contents vii
List of Figures ix
List of Tables xi
1 Introduction 1
2 Classifier Chain 5
2.1 Classifier Chain 5
2.2 Ensemble Classifier Chain 7
2.3 Probabilistic Classifier Chain 7
2.4 Condensed Filter Tree 8
3 Proposed Method 11
3.1 Cyclic Classifier Chain 11
3.2 Cost-Sensitive Cyclic Classifier Chain 13
3.3 Gradient Boosted C4 16
3.4 Gradient Tree Boosted C4 18
4 Experiment 19
4.1 Convergence of Cyclic Classifier Chain and the Effect of the Label Order 19
4.2 Experiment Setup 20
4.3 Comparison between Cost-sensitive and Cost-insensitive 24
4.4 Comparison among the Variations of CCC 24
4.5 Comparison with CFT and PCC 25
5 Conclusions 27
Bibliography 29
dc.language.iso: en
dc.subject: 多標籤分類 (Multilabel Classification) [zh_TW]
dc.subject: 機器學習 (Machine Learning) [zh_TW]
dc.subject: 成本導向學習 (Cost-sensitive Learning) [zh_TW]
dc.subject: Multilabel Classification [en]
dc.subject: Machine Learning [en]
dc.subject: Cost-sensitive Learning [en]
dc.title: 以循環分類鍊處理多標籤分類問題 [zh_TW]
dc.title: Cyclic Classifier Chain for Multilabel Classification [en]
dc.type: Thesis
dc.date.schoolyear: 104-2
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 許永真 (Jane Yung-jen Hsu), 李育杰 (Yuh-Jye Lee)
dc.subject.keyword: 機器學習, 多標籤分類, 成本導向學習 [zh_TW]
dc.subject.keyword: Machine Learning, Multilabel Classification, Cost-sensitive Learning [en]
dc.relation.page: 32
dc.identifier.doi: 10.6342/NTU201603167
dc.rights.note: 有償授權 (paid authorization)
dc.date.accepted: 2016-08-22
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science) [zh_TW]
dc.contributor.author-dept: 資訊工程學研究所 (Graduate Institute of Computer Science and Information Engineering) [zh_TW]
Appears in collections: 資訊工程學系 (Department of Computer Science and Information Engineering)

Files in this item:

File: ntu-105-1.pdf (未授權公開取用; not authorized for public access)
Size: 746.98 kB
Format: Adobe PDF


All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
