NTU Theses and Dissertations Repository › College of Electrical Engineering and Computer Science › Department of Computer Science and Information Engineering
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/56594
Title: Information Bottleneck in DNN Behavior Analysis and its Applications
(Chinese title: Information Bottleneck應用於深度學習網路之行為特徵分析)
Author: Shou-Chun Kao (高紹鈞)
Advisor: Ja-Ling Wu (吳家麟)
Keywords: Machine Learning, Neural Networks, Information Theory, Information Bottleneck, Model Generalization, Model Explainability
Publication Year: 2020
Degree: Master's
Abstract (translated from Chinese): Deep Neural Networks (DNNs) are arguably the core technology behind the rise of modern artificial intelligence. Through learning, a DNN can turn a very complex problem into a non-linear functional relation between inputs and outputs, and it has achieved remarkable results in fields such as computer vision, language processing, and image processing, sometimes even surpassing human-level accuracy.
However, we still know almost nothing about how a DNN model works internally. For example, we cannot devise a criterion that helps us construct the most suitable model architecture for a given task, we cannot explain how a DNN makes judgments from the knowledge it has learned, and we cannot guarantee robustness against the known adversarial attacks on DNNs. For these reasons, DNNs are criticized as "black boxes" whose decisions cannot be trusted with full confidence.
In recent years, more and more researchers have tried to explain the behavior of DNNs. One of the more theoretical directions is based on information theory, which has long been applied in digital communication, data compression, and related fields. Researchers attempt to build connections between DNNs and information theory in order to analyze, and even further optimize, DNNs; among such work, the Information Bottleneck (IB) of Tishby et al. is the best known. This thesis introduces several connections and applications between information theory and DNNs, focusing on IB-based analyses of how DNNs work and the supporting and opposing views found in the related literature. It concludes that the information dynamics observed under different settings do not necessarily reflect the amount of information the network has actually learned, which affects the performance or feasibility of information-theoretic applications to DNNs.
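For context, the standard Information Bottleneck objective referenced above (Tishby et al.'s formulation; the equation itself is not reproduced in this record) can be written as follows, where X is the input, Y the label, and T the learned representation:

```latex
% Information Bottleneck Lagrangian: find a stochastic encoder p(t|x)
% that compresses X (small I(X;T)) while retaining information about Y.
\min_{p(t \mid x)} \; \mathcal{L}_{\mathrm{IB}} \;=\; I(X;T) \;-\; \beta \, I(T;Y)
```

The multiplier β ≥ 0 sets the trade-off: a small β favors compression of the input, while a large β favors preserving label-relevant information.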
Abstract (English): Deep neural networks (DNNs) have become the technical core of modern artificial-intelligence research in recent years. By learning, they can turn a complex problem into a non-linear relation between inputs and outputs, and they have achieved practical success in many tasks, such as computer vision, natural language processing, and image processing. Surprisingly, in many scenarios DNNs can even outperform humans.
Despite this success, very little is known about the inner organization or theoretical principles of DNNs. For instance, we have no standard for constructing the most appropriate model architecture for a specific task, nor can we explain how DNNs learn from structured knowledge (training data) or guarantee their robustness against adversarial attacks. As a result, DNNs are often called "black boxes," which makes it difficult for people to trust the decisions they make.
In recent years, however, more and more research has attempted to offer reasonable explanations of learning with DNNs. One of these directions is based on information theory, which has been applied in digital communication and data compression for many years. Researchers have attempted to connect DNNs with information theory in order to analyze, or further optimize, DNN performance. Among these viewpoints, the Information Bottleneck proposed by Tishby et al. is widely used in different applications.
This thesis investigates the different perspectives on the Information Bottleneck and introduces applications and analytical methods for DNN behavior. In conclusion, we discuss whether the information variations observed while training DNNs carry genuine physical meaning.
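The "information variations" discussed above are typically measured with an empirical mutual-information estimator. As a minimal sketch (an illustration of the kind of estimator used in empirical IB analyses of DNNs, not the author's specific method), a simple binning estimator of I(X;T) between an input variable and a layer activation might look like this:

```python
import numpy as np

def binned_mutual_information(x, t, n_bins=30):
    """Plug-in estimate of I(X;T) in bits: discretize both 1-D samples
    into a 2-D histogram, then compute mutual information of the
    resulting joint distribution. Note: this estimator is biased and
    sensitive to n_bins, which is one practical reason observed
    information curves may not reflect what a network truly learned."""
    joint, _, _ = np.histogram2d(x, t, bins=n_bins)
    p_xt = joint / joint.sum()                  # empirical joint p(x, t)
    p_x = p_xt.sum(axis=1, keepdims=True)       # marginal p(x), shape (n_bins, 1)
    p_t = p_xt.sum(axis=0, keepdims=True)       # marginal p(t), shape (1, n_bins)
    prod = p_x @ p_t                            # outer product p(x)p(t)
    nz = p_xt > 0                               # skip empty cells (0 log 0 = 0)
    return float(np.sum(p_xt[nz] * np.log2(p_xt[nz] / prod[nz])))

# Sanity check: a near-deterministic relation yields high MI,
# while independent samples yield MI near zero (up to estimator bias).
rng = np.random.default_rng(0)
x = rng.normal(size=20000)
t_dependent = x + 0.01 * rng.normal(size=20000)
t_independent = rng.normal(size=20000)
print(binned_mutual_information(x, t_dependent))    # large (several bits)
print(binned_mutual_information(x, t_independent))  # near zero
```

The bias of such plug-in estimators grows with the number of bins and shrinks with sample size, so the measured information dynamics during training depend heavily on these choices — which is directly relevant to the thesis's conclusion.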
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/56594
DOI: 10.6342/NTU202001845
Full-text license: paid authorization (有償授權)
Appears in collections: Department of Computer Science and Information Engineering

Files in this item:
U0001-2407202021543000.pdf — 6.15 MB, Adobe PDF (not currently authorized for public access)


All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
