Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88061
Title: | 探求與分析對抗性擾動中隱藏的人類可識別資訊 Discovering and Analyzing Human-Recognizable Information Hidden in Adversarial Perturbations |
Author: | 門玉仁 Dennis Y. Menn |
Advisor: | 李琳山 Lin-Shan Lee |
Keywords: | Machine Learning, Neural Network, Information Security, Adversarial Perturbations, Pattern Recognition |
Publication Year: | 2023 |
Degree: | Master's |
Abstract: | Neural networks have achieved state-of-the-art performance in many machine learning tasks. However, researchers have found that adding small perturbations to input data, known as adversarial perturbations, can cause neural networks to make incorrect predictions. This means that when neural networks are applied to real-world tasks such as autonomous driving or speaker verification, their safety and reliability remain threatened by adversarial perturbations. To date, however, no practical algorithm can effectively defend against adversarial attacks, in part because the underlying mechanisms of adversarial perturbations are still poorly understood. This thesis proposes that adversarial perturbations contain human-recognizable information, and our experiments show that this information is an essential factor leading to prediction errors in neural networks. This finding contrasts with the widely held view that the prediction errors of neural networks stem from information that humans cannot interpret. The thesis further identifies two effects present in adversarial perturbations, the masking effect and the generation effect, both of which are human-recognizable and can cause neural networks to make mistakes. Moreover, these effects are observed across different attack algorithms and datasets. These findings may help researchers gain a deeper understanding of the nature of adversarial perturbations, including their working mechanism, their transferability, and how adversarial training enhances model interpretability, thereby deepening our understanding of how neural networks operate and facilitating the development of more effective defense algorithms. |
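The record itself contains no code; purely as an illustrative sketch (not taken from the thesis), the following shows one standard way such an adversarial perturbation is computed, using the fast gradient sign method (FGSM) in PyTorch. The names `model`, `images`, and `labels` are hypothetical placeholders, and the returned perturbation `delta` is the kind of object whose human-recognizable structure the thesis studies.

```python
import torch
import torch.nn.functional as F

def fgsm_perturbation(model, x, y, epsilon=0.03):
    """Compute an FGSM adversarial perturbation for input batch x with labels y.

    Returns the perturbation delta itself, which can be visualized directly
    to inspect any human-recognizable structure it carries.
    """
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Step in the direction that increases the loss, bounded by an
    # L-infinity budget epsilon so the change stays visually small.
    delta = epsilon * x.grad.sign()
    return delta.detach()

# Usage sketch (model, images, and labels are hypothetical):
# delta = fgsm_perturbation(model, images, labels)
# x_adv = (images + delta).clamp(0, 1)  # adversarial examples
# Plotting delta (rather than x_adv) is one way to look for
# masking- and generation-like patterns in the perturbation.
```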
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88061 |
DOI: | 10.6342/NTU202301498 |
Full-Text Access: | Authorized (open access worldwide) |
Appears in Collections: | Graduate Institute of Communication Engineering |
Files in This Item:
File | Size | Format
---|---|---
ntu-111-2.pdf | 26.34 MB | Adobe PDF
Items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.