Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/61538
Title: | A Transfer-Learning Approach to Exploit Noisy Information for Classification (基於轉移學習運用有雜訊的資訊處理分類問題) |
Author: | Wei-Shih Lin (林瑋詩) |
Advisor: | Shou-De Lin (林守德) |
Keywords: | Transfer Learning, Feature Transfer, Sentiment Prediction, Novel Topics |
Publication Year: | 2013 |
Degree: | Master's |
Abstract: | Obtaining a large amount of accurately labeled data is time-consuming and labor-intensive; automatic labeling can produce data at scale, but its accuracy is questionable. In general, both the qualitative condition (the accuracy of the data) and the quantitative condition (the amount of data) significantly affect the quality of a supervised learning model, yet in real-world applications it is not always feasible to obtain a large, high-quality dataset. This research assumes that only a small amount of accurate training data is available for learning, and aims to design a transfer-learning approach that exploits a larger amount of noisy (in terms of labels and features) training data to improve learning quality. The problem is non-trivial because the distribution of the noisy training data differs from that of the testing data. This thesis proposes a novel transfer-learning algorithm, Noise-Label Transfer Learning (NLTL), to solve the problem. Instances that have been labeled more than once serve as a bridge for estimating the weight of each instance and a feature-transformation mapping; NLTL thereby exploits the label and feature information of both the accurate and the noisy data, transferring the features into the same domain and adjusting the instance weights for learning. We evaluate the algorithm on three synthetic datasets and one real-world task: sentiment prediction on novel topics, which is harder than conventional sentiment prediction because novel topics lack historical text. Experiments show that NLTL outperforms the existing approaches on all four datasets. |
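The abstract's core idea — combining a small accurate set with a large noisy set by down-weighting noisy instances — can be illustrated with a minimal sketch. This is not the thesis's NLTL algorithm: the dataset is a toy 2-D problem, and the instance weights are fixed by hand, whereas NLTL learns them from doubly-labeled bridge data and also learns a feature transformation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy task (not the thesis data): label is the sign of x1 + x2.
def make_data(n, flip=0.0):
    X = rng.normal(size=(n, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(float)
    flips = rng.random(n) < flip          # flip a fraction of the labels
    y[flips] = 1 - y[flips]
    return X, y

X_acc, y_acc = make_data(20, flip=0.0)       # small, accurately labeled
X_noisy, y_noisy = make_data(500, flip=0.3)  # large, 30% labels flipped

# Instance weights: trust accurate examples more than noisy ones.
# (NLTL estimates such weights via bridge instances; here they are fixed.)
X = np.vstack([X_acc, X_noisy])
y = np.concatenate([y_acc, y_noisy])
w = np.concatenate([np.full(len(y_acc), 5.0), np.full(len(y_noisy), 0.5)])

# Weighted logistic regression trained by gradient descent.
theta = np.zeros(X.shape[1])
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-X @ theta))       # predicted probabilities
    grad = X.T @ (w * (p - y)) / w.sum()       # weighted gradient
    theta -= 1.0 * grad

# Evaluate on clean held-out data.
X_test, y_test = make_data(200, flip=0.0)
acc = float(((X_test @ theta > 0) == (y_test == 1)).mean())
print("test accuracy:", round(acc, 2))
```

Even with 30% label noise in the bulk of the training pool, the weighted model recovers the true decision boundary well, because the weighting limits the influence of mislabeled instances.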
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/61538 |
Full-Text Rights: | Paid authorization |
Appears in Collections: | Department of Computer Science and Information Engineering |
Files in This Item:
File | Size | Format |
---|---|---|
ntu-102-1.pdf (currently not authorized for public access) | 1.31 MB | Adobe PDF |
All items in the system are protected by copyright, with all rights reserved, unless otherwise indicated in their copyright terms.