Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/80238

Full metadata record

DC field (language): value
dc.contributor.advisor: 郭斯彥 (Sy-Yen Kuo)
dc.contributor.author (en): Chih-Ling Chang
dc.contributor.author (zh_TW): 張芷苓
dc.date.accessioned: 2022-11-24T03:03:04Z
dc.date.available: 2021-08-04
dc.date.available: 2022-11-24T03:03:04Z
dc.date.copyright: 2021-08-04
dc.date.issued: 2021
dc.date.submitted: 2021-07-21
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/80238
dc.description.abstract (zh_TW): In recent years, adversarial-example attacks on neural networks have become more influential and more dangerous than before, so artificial intelligence (AI) models no longer have strong defenses against adversarial attacks. This study proposes a method for evaluating the robustness of AI models. Six commonly used image-classification CNN models were evaluated under 13 types of adversarial attack. A model's robustness is assessed as a relative value, which can serve as a reference for further improvement. Unlike previous related work, our algorithm lets the user freely choose the neural network model, the dataset and the attack method. In addition, for users who do not want to disclose their model architecture but still want to evaluate their model's defenses, a substitute network must be constructed to carry out the evaluation of the black-box model. Because this study also uses substitute network models that we built ourselves for defense analysis, and because the difference between attacking the black-box model and attacking the substitute model is very small, attacking the substitute model and analyzing its defenses is feasible and provides a useful reference.
dc.description.provenance (en): Made available in DSpace on 2022-11-24T03:03:04Z (GMT). No. of bitstreams: 1
U0001-0807202110592800.pdf: 2062329 bytes, checksum: 8a274139b6b57ffc79f3d6fe07b06f37 (MD5)
Previous issue date: 2021
dc.description.tableofcontents: Acknowledgements i; 摘要 ii; Abstract iii; Contents iv; List of Figures v; List of Tables v; 1 Introduction 1; 2 Related Work 4; 2.1 Adversarial Attack 4; 2.2 Robustness 6; 2.3 Adversarial Example API 7; 3 Experiment Process 9; 3.1 Datasets 11; 3.2 CNN models 11; 3.3 Adversarial Attack Methods 15; 4 Robustness Evaluation 16; 4.1 Attack Accuracy 18; 4.2 Dispersion of Attack 19; 5 Experimental Process and Result 21; 5.1 Robustness Evaluation 21; 5.2 Substitute Model Evaluation 26; 6 Conclusion 34; References 36
dc.language.iso: en
dc.subject (zh_TW): image processing
dc.subject (zh_TW): computer vision
dc.subject (zh_TW): adversarial-example attacks
dc.subject (zh_TW): convolutional neural networks
dc.subject (zh_TW): defense-capability analysis
dc.subject (zh_TW): artificial intelligence
dc.subject (en): Robustness evaluation
dc.subject (en): adversarial example
dc.subject (en): adversarial attack
dc.subject (en): artificial intelligence
dc.subject (en): convolution neural network (CNN)
dc.subject (en): computer vision
dc.title (zh_TW): Evaluation and Testing of the Defense Capability of AI Models Based on Adversarial-Example Attacks
dc.title (en): Evaluating Robustness of AI Models against Adversarial Attacks
dc.date.schoolyear: 109-2
dc.description.degree: Master's
dc.contributor.oralexamcommittee: 顏嗣鈞 (Hsin-Tsai Liu), 雷欽隆 (Chih-Yang Tseng), 游家牧, 陳英一
dc.subject.keyword (zh_TW): defense-capability analysis, convolutional neural network, adversarial-example attack, image processing, computer vision, artificial intelligence
dc.subject.keyword (en): Robustness evaluation, convolution neural network (CNN), adversarial attack, adversarial example, computer vision, artificial intelligence
dc.relation.page: 39
dc.identifier.doi: 10.6342/NTU202101339
dc.rights.note: Authorized for release (restricted to on-campus access)
dc.date.accepted: 2021-07-21
dc.contributor.author-college (zh_TW): College of Electrical Engineering and Computer Science
dc.contributor.author-dept (zh_TW): Graduate Institute of Electrical Engineering
Appears in collections: Department of Electrical Engineering

Files in this item:
File: U0001-0807202110592800.pdf (2.01 MB, Adobe PDF)
Access is restricted to NTU campus IP addresses (off-campus users should connect through the library's VPN service).