Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/86525
| Title: | Investigations in Applying Multilabel Classification (探索多標籤分類的應用) |
| Author: | Li-Chung Lin (林立中) |
| Advisor: | Chih-Jen Lin (林智仁) |
| Keywords: | multi-label, classification |
| Publication year: | 2022 |
| Degree: | Master's |
| Abstract: | Prediction using the ground truth sounds like an oxymoron in machine learning. However, such an unrealistic setting has been used in hundreds, if not thousands, of papers in the area of finding graph representations. To evaluate the multi-label problem of node classification using the obtained representations, many works assume that the number of labels of each test instance is known in the prediction stage. In practice such ground-truth information is rarely available, but we point out that this inappropriate setting is now ubiquitous in this research area. We investigate in detail why the situation occurs. Our analysis indicates that, with unrealistic information, the performance is likely over-estimated. To see why suitable predictions were not used, we identify difficulties in applying some multi-label techniques. For use in future studies, we propose simple and effective settings that do not rely on practically unknown information. Finally, we take this chance to compare major graph-representation learning methods on multi-label node classification. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/86525 |
| DOI: | 10.6342/NTU202201936 |
| Full-text authorization: | Authorized (open access worldwide) |
| Electronic full-text release date: | 2022-08-18 |
| Appears in department: | Department of Computer Science and Information Engineering |
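
The evaluation setting criticized in the abstract can be illustrated with a toy sketch. The scores, labels, and the fixed 0.5 threshold below are invented for illustration and are not taken from the thesis; the threshold rule is only a stand-in for "a prediction rule that does not use test labels", not the setting the thesis proposes. The sketch compares micro-F1 when the true number of labels of each test instance is used to pick the top-scoring labels against micro-F1 when no such information is used.

```python
import numpy as np

def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over all (instance, label) pairs."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom > 0 else 0.0

def predict_oracle_top_k(scores, y_true):
    """Unrealistic rule: for each instance, predict its k highest-scoring
    labels, where k is read from the ground truth of that test instance."""
    y_pred = np.zeros_like(y_true)
    for i, k in enumerate(y_true.sum(axis=1)):
        top = np.argsort(-scores[i])[:k]
        y_pred[i, top] = 1
    return y_pred

def predict_threshold(scores, threshold=0.5):
    """Rule that needs no test labels: predict every label whose score
    reaches a fixed threshold (hypothetical choice for this sketch)."""
    return (scores >= threshold).astype(int)

# Made-up test set: 3 instances, 4 labels.
y_true = np.array([[1, 0, 0, 0],
                   [1, 1, 0, 0],
                   [0, 0, 1, 1]])
scores = np.array([[0.9, 0.6, 0.55, 0.2],
                   [0.8, 0.3, 0.4, 0.1],
                   [0.2, 0.1, 0.7, 0.6]])

f1_oracle = micro_f1(y_true, predict_oracle_top_k(scores, y_true))
f1_threshold = micro_f1(y_true, predict_threshold(scores))
print(f"micro-F1 with known label counts: {f1_oracle:.3f}")
print(f"micro-F1 with fixed threshold:    {f1_threshold:.3f}")
```

On this toy data the rule that reads the label count from the ground truth scores higher, which is the kind of over-estimation the abstract warns about. A single hand-made example does not show that this holds in general.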
Files in this item:
| File | Size | Format | |
|---|---|---|---|
| U0001-0108202216193400.pdf | 646.62 kB | Adobe PDF | View/Open |
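Items in this repository are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.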
