A KNN-based Ranking Approach for Multilabel Classification (基於最近鄰居之排列方法於多標籤分類問題)

Tsung-Hsien Chiang; 江宗憲

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/66639

Title:	A KNN-based Ranking Approach for Multilabel Classification (基於最近鄰居之排列方法於多標籤分類問題)
Author:	Tsung-Hsien Chiang (江宗憲)
Advisor:	Shou-De Lin (林守德)
Keywords:	machine learning, data mining, multi-label classification
Year of publication:	2011
Degree:	Master's
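Abstract:	In recent years, the multi-label classification problem has received increasing attention in the field of machine learning. For this problem, this study proposes a ranking approach based on nearest neighbors. A ranking model is used to redefine the importance of the neighbors and to select which neighbors' labels are more likely to be the correct answer. The more likely a neighbor's labels are to be correct, the higher that neighbor is ranked. Based on this ranking, we use weighted voting to decide the final answer. To determine the weights, we formulate an optimization problem and solve it to find the weight that should correspond to each rank. We collect data sets from various real-world domains for the experiments and compare the method against other well-known algorithms that also use nearest neighbors. The experimental results show that the method generally performs well, and on some problems its results are slightly better than those of other well-known nearest-neighbor-based algorithms. Based on these results, we believe that appropriately exploiting nearest neighbors is very helpful for solving multi-label classification problems. Multi-label classification has attracted a great deal of attention in recent years. This paper presents an interesting finding, namely that being able to identify neighbors with trustworthy labels can significantly improve classification accuracy. Based on this finding, we propose a k-nearest-neighbor-based ranking approach to solve the multi-label classification problem. The approach exploits a ranking model to learn which neighbors' labels are more trustworthy candidates for a weighted KNN-based strategy, and then assigns higher weights to those candidates when making weighted-voting decisions. The weights are determined by using a generalized pattern search technique. We collect several real-world data sets from various domains for the experiments. Our experimental results demonstrate that the proposed method outperforms state-of-the-art instance-based learning approaches. We believe that appropriately exploiting k-nearest neighbors is useful for solving the multi-label problem.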
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/66639
Full-text license:	Paid license
Appears in department:	Department of Computer Science and Information Engineering
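
The rank-weighted voting step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation and rests on several assumptions: neighbors are ranked by plain Euclidean distance as a stand-in for the learned ranking model, the per-rank weights are hand-picked rather than found by generalized pattern search, and a label is predicted when its normalized vote reaches a hypothetical threshold of 0.5. The function name `rank_weighted_knn_predict` and the toy data are likewise illustrative.

```python
import math
from typing import List, Sequence


def rank_weighted_knn_predict(
    train_x: Sequence[Sequence[float]],
    train_y: Sequence[Sequence[int]],
    query: Sequence[float],
    rank_weights: Sequence[float],
    threshold: float = 0.5,
) -> List[int]:
    """Predict a binary label vector for `query` by rank-weighted voting."""
    k = len(rank_weights)

    # Assumption: rank neighbors by Euclidean distance to the query.
    # The thesis ranks them with a learned ranking model instead.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    neighbors = order[:k]

    total = sum(rank_weights)
    num_labels = len(train_y[0])
    prediction = []
    for label in range(num_labels):
        # Each neighbor votes for its own labels with the weight of its rank.
        score = sum(w * train_y[i][label] for w, i in zip(rank_weights, neighbors))
        prediction.append(1 if score / total >= threshold else 0)
    return prediction


# Toy data: four training points, two labels each (hypothetical values).
train_x = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
train_y = [[1, 0], [1, 1], [0, 1], [0, 1]]
weights = [0.6, 0.3, 0.1]  # hand-picked, decreasing with rank

print(rank_weighted_knn_predict(train_x, train_y, [0.02, 0.0], weights))  # prints [1, 0]
```

In this toy run the two closest neighbors carry 90% of the vote, so the first label is predicted and the second is not, even though two of the three neighbors have the second label. Replacing the distance-based ordering with a learned ranker and tuning `weights` by a search procedure would bring the sketch closer to the approach the abstract describes.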

Files in this item:

File	Size	Format
ntu-100-1.pdf (not currently authorized for public access)	1.07 MB	Adobe PDF


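Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.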
