NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88061
Full metadata record:
dc.contributor.advisor (zh_TW): 李琳山
dc.contributor.advisor (en): Lin-Shan Lee
dc.contributor.author (zh_TW): 門玉仁
dc.contributor.author (en): Dennis Y. Menn
dc.date.accessioned: 2023-08-08T16:07:10Z
dc.date.available: 2023-11-09
dc.date.copyright: 2023-08-08
dc.date.issued: 2023
dc.date.submitted: 2023-07-13
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88061
dc.description.abstract (zh_TW):
Neural networks have demonstrated state-of-the-art performance across many areas of machine learning. However, researchers have found that adding tiny perturbations to the input, known as adversarial perturbations, can mislead a neural network's predictions. This means that when neural networks are applied to real-world tasks such as autonomous driving or speaker recognition, their safety and reliability remain threatened by adversarial perturbations. To date, however, no practical algorithm can defend against them effectively, in part because their working mechanism is not understood.
This thesis proposes that adversarial perturbations contain human-recognizable information and further shows experimentally that this information is an important factor causing neural networks to make incorrect predictions. This differs markedly from the widely held view that such prediction errors stem from information humans cannot interpret.
The thesis also identifies two effects present in the perturbations, the masking effect and the generation effect; both are human-interpretable and can cause models to misclassify. Moreover, both effects are observed across different attack algorithms and datasets.
These findings may help researchers analyze the properties of adversarial perturbations in greater depth, including their working mechanism, their transferability, and how adversarial training enhances model interpretability, leading to a deeper understanding of how neural networks operate and advancing the development of defense algorithms.
dc.description.abstract (en):
Neural networks have achieved state-of-the-art performance in many machine learning tasks. However, researchers have found that adding small perturbations to input data, known as adversarial perturbations, can cause neural networks to make incorrect predictions. This means that when neural networks are applied to real-world tasks such as autonomous driving or speaker verification, their safety and reliability remain under threat from adversarial perturbations. There is still no practical algorithm that can effectively defend against adversarial attacks, partly because the underlying mechanism of adversarial perturbations remains unclear.
This thesis proposes that adversarial perturbations contain human-recognizable information. Our experiments show that this information is an essential factor leading to prediction errors in neural networks, a finding that contradicts the widely held belief that adversarial perturbations are unrecognizable to humans.
This thesis also identifies two effects present in adversarial perturbations, the masking effect and the generation effect. Both are human-recognizable and can cause neural networks to make mistakes. More importantly, these effects appear across different attack algorithms and datasets.
Our findings may help researchers gain a deeper understanding of the nature of adversarial perturbations, including their working mechanism, their transferability, and how adversarial training enhances model interpretability, leading to a deeper understanding of neural networks and to more effective defense algorithms.
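To make the attack setting described in the abstract concrete, the following is a minimal sketch of the fast gradient sign method (FGSM), the single-step gradient attack on which the basic iterative method listed in Section 4.6.4.1 is built. The classifier `model`, the image batch `x` (values in [0, 1]), the labels `y`, and the budget `epsilon` are illustrative assumptions; this is not code or a configuration taken from the thesis.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=8 / 255):
    """One-step, L-infinity-bounded adversarial perturbation (illustrative sketch)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)               # loss the attacker wants to increase
    loss.backward()
    perturbation = epsilon * x.grad.sign()            # step along the sign of the input gradient
    x_adv = torch.clamp(x + perturbation, 0.0, 1.0)   # keep the result a valid image
    return x_adv.detach(), perturbation.detach()
```

Iterating this step with a smaller step size and re-projecting onto the epsilon ball after each step gives the basic iterative method; the returned perturbation tensor is the kind of signal the thesis examines for human-recognizable information.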
dc.description.provenance (en): Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-08-08T16:07:10Z. No. of bitstreams: 0
dc.description.provenance (en): Made available in DSpace on 2023-08-08T16:07:10Z (GMT). No. of bitstreams: 0
dc.description.tableofcontents:
Acknowledgements i
Chinese Abstract iii
Abstract v
Table of Contents vii
List of Figures xi
List of Tables xiii
Chapter 1: Introduction 1
1.1 Motivation 1
1.2 Research Direction 2
1.3 Main Contributions 3
1.4 Thesis Organization 3
Chapter 2: Background 5
2.1 Adversarial Perturbations 5
2.2 Attack Algorithms 6
2.2.1 White-Box Attacks 7
2.2.2 Black-Box Attacks 9
2.3 Defense Algorithms 10
2.4 Properties and Phenomena of Perturbations 14
2.5 Chapter Summary 17
Chapter 3: Related Work 19
3.1 Local Linearity of Neural Networks 19
3.2 Adversarial Examples Deviate from the Data Distribution 21
3.3 Perturbations Alter Non-Robust Features 22
3.4 Mathematical Proofs of the Existence of Perturbations 23
3.5 Chapter Summary 23
Chapter 4: Searching for the Essence of Perturbations 25
4.1 The Proposed Hypothesis 25
4.2 Work Related to the Proposed Hypothesis 26
4.3 Challenges Facing the Proposed Hypothesis 27
4.3.1 Recognizability of Perturbations 27
4.3.2 Existence of Universal Adversarial Perturbations 28
4.3.3 Complexity of the Problem 29
4.3.4 Limited Understanding of Neural Networks 29
4.4 Extracting Recognizable Information 29
4.5 Experimental Method 30
4.6 Experimental Setup 31
4.6.1 Input Data 31
4.6.2 Generating Perturbations 32
4.6.2.1 Single-Model Setting 32
4.6.2.2 Noisy Multi-Model Setting 32
4.6.3 Model Architectures 32
4.6.3.1 Model Architecture for MNIST 33
4.6.3.2 Model Architecture for CIFAR10 33
4.6.3.3 Model Architecture for ImageNet 33
4.6.4 Attack Parameters 34
4.6.4.1 Basic Iterative Method 34
4.6.4.2 Carlini-Wagner (CW) Attack 35
4.6.4.3 DeepFool Attack 35
4.6.5 Gaussian Noise 35
4.6.6 Visualizing Perturbations 36
4.7 Algorithm Improvements 37
4.7.1 Clipping Effect 37
4.7.2 Calibrating Output Values 37
4.7.3 Accelerating the Algorithm 38
4.8 Experimental Results 39
4.8.1 Untargeted Attacks 39
4.8.2 Recognizability of Perturbations 41
4.8.2.1 Machine Evaluation 41
4.8.2.2 Human Evaluation 42
4.8.3 Evaluating Attack Strength 43
4.8.4 Impact of the Masking Effect 44
4.8.5 Targeted Attacks 53
4.9 Chapter Summary 54
Chapter 5: Experimental Observations 57
5.1 Search-Based Attacks 57
5.2 Effects of Noise 59
5.2.1 Removing Noise 60
5.2.2 Effect of the Standard Deviation 61
5.2.3 Emergence of Grid-Like Patterns 61
5.2.4 Loss of Fine Detail 62
5.3 Properties of Perturbations 62
5.3.1 Complementarity 62
5.3.2 Comparing Generated Perturbations 63
5.4 The Vanishing Cat 65
5.4.1 Chapter Summary 68
Chapter 6: Discussion 69
6.1 Vulnerability of Neural Networks 69
6.2 Transferability of Perturbations 69
6.3 Model Interpretability 70
6.4 The Role of Non-Robust Features 71
6.5 Chapter Summary 71
Chapter 7: Conclusion and Future Work 73
7.1 Summary of the Research 73
7.2 Future Work 74
References 75
dc.language.iso: zh_TW
dc.subject (zh_TW): Machine Learning
dc.subject (zh_TW): Information Security
dc.subject (zh_TW): Neural Networks
dc.subject (zh_TW): Adversarial Perturbations
dc.subject (zh_TW): Pattern Recognition
dc.subject (en): Neural Network
dc.subject (en): Information Security
dc.subject (en): Adversarial Perturbations
dc.subject (en): Machine Learning
dc.subject (en): Pattern Recognition
dc.title (zh_TW): 探求與分析對抗性擾動中隱藏的人類可識別資訊
dc.title (en): Discovering and Analyzing Human-Recognizable Information Hidden in Adversarial Perturbations
dc.type: Thesis
dc.date.schoolyear: 111-2
dc.description.degree: Master
dc.contributor.oralexamcommittee (zh_TW): 李宏毅;陳尚澤;賴穎暉;王新民
dc.contributor.oralexamcommittee (en): Hung-Yi Lee; Shang-Tse Chen; Ying-Hui Lai; Hsin-Min Wang
dc.subject.keyword (zh_TW): Machine Learning, Neural Networks, Information Security, Adversarial Perturbations, Pattern Recognition
dc.subject.keyword (en): Machine Learning, Neural Network, Information Security, Adversarial Perturbations, Pattern Recognition
dc.relation.page: 83
dc.identifier.doi: 10.6342/NTU202301498
dc.rights.note: Authorized for release (worldwide open access)
dc.date.accepted: 2023-07-14
dc.contributor.author-college: College of Electrical Engineering and Computer Science
dc.contributor.author-dept: Graduate Institute of Communication Engineering
Appears in collections: Graduate Institute of Communication Engineering

Files in this item:
File | Size | Format
ntu-111-2.pdf | 26.34 MB | Adobe PDF


Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
