A Bicameralism Voting Framework to Enhance a Model Using Unlabeled Data

Yu-Tung Hsieh; 謝雨桐

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/8240

Title:	A Bicameralism Voting Framework to Enhance a Model Using Unlabeled Data (通過兩院制投票框架來使用未標記數據增強模型)
Author:	Yu-Tung Hsieh (謝雨桐)
Advisor:	Pangfeng Liu (劉邦鋒)
Keywords:	Machine Learning, Deep Learning, Federated Learning, Transfer Learning, CNN, Mobile Device
Year of publication:	2020
Degree:	Master's
Abstract:	In this paper, we propose bicameralism voting, a method for improving the accuracy of a deep learning network. After training a deep learning network on existing data, we often want to improve it with newly collected data. Retraining the model on all available data, however, is time-consuming. Instead, we propose a collective framework that trains models on mobile devices via transfer learning, using new data that is also collected on those devices. We then gather the models trained on the mobile devices, let each make its own prediction, and combine the predictions by voting to obtain a more accurate result. Bicameralism voting differs from federated learning: rather than averaging the weights of the models from the mobile devices, it lets the models vote on the outcome. The framework can also use unlabeled data to enhance a model. With a filter added to the framework, the resulting accuracy is only slightly lower than that of a model trained on labeled data. Bicameralism voting has three main advantages. First, it improves the accuracy of the deep learning model. With VGG-19 on the Food-101 dataset, it reaches 77.838% accuracy, higher than the 75.517% of a single model trained on the same amount of data. Second, it saves computational resources, because it only updates an existing model and the computation can be done in parallel across multiple devices. For example, in our experiments, updating an existing model via transfer learning takes about 10 minutes on a server, whereas training a full model from scratch on both the original and the new data takes a week or more. Finally, bicameralism voting is more flexible than federated learning: the models trained on different mobile devices can use any model architecture, any input preprocessing, and any model format.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/8240
DOI:	10.6342/NTU202003100
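Full-text license:	Authorized (open to the public worldwide)
Appears in department:	Department of Computer Science and Information Engineering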
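
The abstract above contrasts combining predictions by voting with the weight averaging used in federated learning. The sketch below illustrates only that general idea with a simple plurality vote; it is not the thesis's bicameral scheme, whose details the abstract does not give. The function names, the `min_agreement` threshold, and the agreement-based filter for unlabeled samples are all hypothetical assumptions made for illustration.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def plurality_vote(predictions: Sequence[Hashable]) -> Hashable:
    """Return the label predicted by the most models.

    Ties go to the label that appears first in `predictions`.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    return Counter(predictions).most_common(1)[0][0]


def pseudo_label(
    predictions: Sequence[Hashable], min_agreement: float = 0.8
) -> Optional[Hashable]:
    """Hypothetical filter for unlabeled data (an assumption, not from the thesis).

    Keep a sample, labeled with the winning vote, only when at least
    `min_agreement` of the models agree; otherwise return None to discard it.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    label, votes = Counter(predictions).most_common(1)[0]
    return label if votes / len(predictions) >= min_agreement else None


# Each entry stands for the class predicted by one device-trained model.
device_predictions = ["pizza", "sushi", "pizza"]
winner = plurality_vote(device_predictions)
kept = pseudo_label(device_predictions)  # 2 of 3 agree, below the 0.8 threshold
```

Because only predicted labels are combined, the models in this sketch never need to share an architecture or weight layout, which is the flexibility the abstract attributes to voting.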

Files in this item:

File	Size	Format
U0001-1208202015205000.pdf	892.17 kB	Adobe PDF
