Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/87976

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 陳尚澤 | zh_TW |
| dc.contributor.advisor | Shang-Tse Chen | en |
| dc.contributor.author | 王竑睿 | zh_TW |
| dc.contributor.author | Hung-Jui Wang | en |
| dc.date.accessioned | 2023-08-01T16:11:13Z | - |
| dc.date.available | 2023-11-09 | - |
| dc.date.copyright | 2023-08-01 | - |
| dc.date.issued | 2023 | - |
| dc.date.submitted | 2023-06-27 | - |
| dc.identifier.citation | [1] W. Brendel, J. Rauber, and M. Bethge. Decision-based adversarial attacks: Reliable attacks against black-box machine learning models. In 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings. OpenReview.net, 2018.
[2] N. Carlini and D. A. Wagner. Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy, SP 2017, San Jose, CA, USA, May 22-26, 2017, pages 39–57. IEEE Computer Society, 2017.
[3] Y. Chen, X. Yuan, J. Zhang, Y. Zhao, S. Zhang, K. Chen, and X. Wang. Devil's whisper: A general approach for physical adversarial attacks against commercial black-box speech recognition devices. In S. Capkun and F. Roesner, editors, 29th USENIX Security Symposium, USENIX Security 2020, August 12-14, 2020, pages 2667–2684. USENIX Association, 2020.
[4] M. Denil, B. Shakibi, L. Dinh, M. Ranzato, and N. de Freitas. Predicting parameters in deep learning. In C. J. C. Burges, L. Bottou, Z. Ghahramani, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013, Lake Tahoe, Nevada, United States, pages 2148–2156, 2013.
[5] Y. Dong, F. Liao, T. Pang, H. Su, J. Zhu, X. Hu, and J. Li. Boosting adversarial attacks with momentum. In 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA, June 18-22, 2018, pages 9185–9193. Computer Vision Foundation / IEEE Computer Society, 2018.
[6] Y. Dong, T. Pang, H. Su, and J. Zhu. Evading defenses to transferable adversarial examples by translation-invariant attacks. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, June 16-20, 2019, pages 4312–4321. Computer Vision Foundation / IEEE, 2019.
[7] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby. An image is worth 16x16 words: Transformers for image recognition at scale. In 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021. OpenReview.net, 2021.
[8] Y. Duan, J. Zou, X. Zhou, W. Zhang, J. Zhang, and Z. Pan. Adversarial attack via dual-stage network erosion, 2022.
[9] J. Frankle and M. Carbin. The lottery ticket hypothesis: Finding sparse, trainable neural networks. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net, 2019.
[10] M. Gubri, M. Cordy, M. Papadakis, Y. L. Traon, and K. Sen. LGV: boosting adversarial example transferability from large geometric vicinity. In S. Avidan, G. J. Brostow, M. Cissé, G. M. Farinella, and T. Hassner, editors, Computer Vision - ECCV 2022 - 17th European Conference, Tel Aviv, Israel, October 23-27, 2022, Proceedings, Part IV, volume 13664 of Lecture Notes in Computer Science, pages 603–618. Springer, 2022.
[11] S. Han, J. Pool, J. Tran, and W. J. Dally. Learning both weights and connections for efficient neural network. In C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, editors, Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, December 7-12, 2015, Montreal, Quebec, Canada, pages 1135–1143, 2015.
[12] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, June 27-30, 2016, pages 770–778. IEEE Computer Society, 2016.
[13] G. Huang, Z. Liu, L. van der Maaten, and K. Q. Weinberger. Densely connected convolutional networks. In 2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA, July 21-26, 2017, pages 2261–2269. IEEE Computer Society, 2017.
[14] Y. Huang and A. W. Kong. Transferable adversarial attack based on integrated gradients. In The Tenth International Conference on Learning Representations, ICLR 2022, Virtual Event, April 25-29, 2022. OpenReview.net, 2022.
[15] A. Ilyas, L. Engstrom, A. Athalye, and J. Lin. Black-box adversarial attacks with limited queries and information. In J. G. Dy and A. Krause, editors, Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, July 10-15, 2018, volume 80 of Proceedings of Machine Learning Research, pages 2142–2151. PMLR, 2018.
[16] N. Inkawhich, K. J. Liang, L. Carin, and Y. Chen. Transferable perturbations of deep feature distributions. In 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020. OpenReview.net, 2020.
[17] N. Inkawhich, K. J. Liang, B. Wang, M. Inkawhich, L. Carin, and Y. Chen. Perturbing across the feature hierarchy to improve standard and strict blackbox attack transferability. In H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, and H. Lin, editors, Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual, 2020.
[18] A. Kurakin, I. Goodfellow, S. Bengio, Y. Dong, F. Liao, M. Liang, T. Pang, J. Zhu, X. Hu, C. Xie, J. Wang, Z. Zhang, Z. Ren, A. Yuille, S. Huang, Y. Zhao, Y. Zhao, Z. Han, J. Long, Y. Berdibekov, T. Akiba, S. Tokui, and M. Abe. Adversarial attacks and defences competition, 2018.
[19] A. Kurakin, I. J. Goodfellow, and S. Bengio. Adversarial examples in the physical world. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Workshop Track Proceedings. OpenReview.net, 2017.
[20] Y. LeCun, J. S. Denker, and S. A. Solla. Optimal brain damage. In D. S. Touretzky, editor, Advances in Neural Information Processing Systems 2, [NIPS Conference, Denver, Colorado, USA, November 27-30, 1989], pages 598–605. Morgan Kaufmann, 1989.
[21] C. Li, S. Gao, C. Deng, D. Xie, and W. Liu. Cross-modal learning with adversarial samples. In H. M. Wallach, H. Larochelle, A. Beygelzimer, F. d’Alché-Buc, E. B. Fox, and R. Garnett, editors, Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada, pages 10791–10801, 2019.
[22] M. Li, C. Deng, T. Li, J. Yan, X. Gao, and H. Huang. Towards transferable targeted attack. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, Seattle, WA, USA, June 13-19, 2020, pages 638–646. Computer Vision Foundation / IEEE, 2020.
[23] Y. Li, S. Bai, Y. Zhou, C. Xie, Z. Zhang, and A. L. Yuille. Learning transferable adversarial examples via ghost networks. In The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY, USA, February 7-12, 2020, pages 11458–11465. AAAI Press, 2020.
[24] J. Lin, C. Song, K. He, L. Wang, and J. E. Hopcroft. Nesterov accelerated gradient and scale invariance for adversarial attacks. In 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020. OpenReview.net, 2020.
[25] H. Liu, Z. Dai, D. R. So, and Q. V. Le. Pay attention to MLPs. In M. Ranzato, A. Beygelzimer, Y. N. Dauphin, P. Liang, and J. W. Vaughan, editors, Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual, pages 9204–9215, 2021.
[26] Y. Liu, X. Chen, C. Liu, and D. Song. Delving into transferable adversarial examples and black-box attacks. In International Conference on Learning Representations, 2017.
[27] Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo. Swin transformer: Hierarchical vision transformer using shifted windows. In 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada, October 10-17, 2021, pages 9992–10002. IEEE, 2021.
[28] Z. Liu, M. Sun, T. Zhou, G. Huang, and T. Darrell. Rethinking the value of network pruning. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net, 2019.
[29] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu. Towards deep learning models resistant to adversarial attacks. In 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings. OpenReview.net, 2018.
[30] M. Naseer, S. H. Khan, M. Hayat, F. S. Khan, and F. Porikli. On generating transferable targeted perturbations. In 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada, October 10-17, 2021, pages 7688–7697. IEEE, 2021.
[31] Y. Nesterov. A method for unconstrained convex minimization problem with the rate of convergence O(1/k^2). In Doklady AN USSR, volume 269, pages 543–547, 1983.
[32] Z. Qin, Y. Fan, Y. Liu, L. Shen, Y. Zhang, J. Wang, and B. Wu. Boosting the transferability of adversarial attacks with reverse adversarial perturbation. In A. H. Oh, A. Agarwal, D. Belgrave, and K. Cho, editors, Advances in Neural Information Processing Systems, 2022.
[33] H. Salman, A. Ilyas, L. Engstrom, A. Kapoor, and A. Madry. Do adversarially robust imagenet models transfer better? In H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, and H. Lin, editors, Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual, 2020.
[34] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra. Grad-cam: Visual explanations from deep networks via gradient-based localization. In IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, October 22-29, 2017, pages 618–626. IEEE Computer Society, 2017.
[35] C. Shorten and T. M. Khoshgoftaar. A survey on image data augmentation for deep learning. J. Big Data, 6:60, 2019.
[36] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In Y. Bengio and Y. LeCun, editors, 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, 2015.
[37] N. Srivastava, G. E. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res., 15(1):1929–1958, 2014.
[38] C. Szegedy, S. Ioffe, V. Vanhoucke, and A. A. Alemi. Inception-v4, inception-resnet and the impact of residual connections on learning. In S. Singh and S. Markovitch, editors, Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, February 4-9, 2017, San Francisco, California, USA, pages 4278–4284. AAAI Press, 2017.
[39] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna. Rethinking the inception architecture for computer vision. In 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, June 27-30, 2016, pages 2818–2826. IEEE Computer Society, 2016.
[40] I. O. Tolstikhin, N. Houlsby, A. Kolesnikov, L. Beyer, X. Zhai, T. Unterthiner, J. Yung, A. Steiner, D. Keysers, J. Uszkoreit, M. Lucic, and A. Dosovitskiy. Mlp-mixer: An all-mlp architecture for vision. In M. Ranzato, A. Beygelzimer, Y. N. Dauphin, P. Liang, and J. W. Vaughan, editors, Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual, pages 24261–24272, 2021.
[41] H. Touvron, P. Bojanowski, M. Caron, M. Cord, A. El-Nouby, E. Grave, A. Joulin, G. Synnaeve, J. Verbeek, and H. Jégou. Resmlp: Feedforward networks for image classification with data-efficient training. CoRR, abs/2105.03404, 2021.
[42] F. Tramèr, A. Kurakin, N. Papernot, I. J. Goodfellow, D. Boneh, and P. D. McDaniel. Ensemble adversarial training: Attacks and defenses. In 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings. OpenReview.net, 2018.
[43] X. Wang and K. He. Enhancing the transferability of adversarial attacks through variance tuning. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021, virtual, June 19-25, 2021, pages 1924–1933. Computer Vision Foundation / IEEE, 2021.
[44] X. Wang, X. He, J. Wang, and K. He. Admix: Enhancing the transferability of adversarial attacks. In 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada, October 10-17, 2021, pages 16138–16147. IEEE, 2021.
[45] C. Xie, Z. Zhang, Y. Zhou, S. Bai, J. Wang, Z. Ren, and A. L. Yuille. Improving transferability of adversarial examples with input diversity. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, CA, USA, June 16-20, 2019, pages 2730–2739. Computer Vision Foundation / IEEE, 2019.
[46] Y. Xiong, J. Lin, M. Zhang, J. E. Hopcroft, and K. He. Stochastic variance reduced ensemble adversarial attack for boosting the adversarial transferability. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 14983–14992, June 2022.
[47] X. Yang, Y. Dong, T. Pang, H. Su, and J. Zhu. Boosting transferability of targeted adversarial examples via hierarchical generative networks. In S. Avidan, G. J. Brostow, M. Cissé, G. M. Farinella, and T. Hassner, editors, Computer Vision - ECCV 2022 - 17th European Conference, Tel Aviv, Israel, October 23-27, 2022, Proceedings, Part IV, volume 13664 of Lecture Notes in Computer Science, pages 725–742. Springer, 2022.
[48] Z. Zhao, Z. Liu, and M. Larson. On success and simplicity: A second look at transferable targeted attacks. In M. Ranzato, A. Beygelzimer, Y. Dauphin, P. Liang, and J. W. Vaughan, editors, Advances in Neural Information Processing Systems, volume 34, pages 6115–6128. Curran Associates, Inc., 2021.
[49] J. Zou, Y. Duan, B. Li, W. Zhang, Y. Pan, and Z. Pan. Making adversarial examples more transferable and indistinguishable. In Thirty-Sixth AAAI Conference on Artificial Intelligence, AAAI 2022, Thirty-Fourth Conference on Innovative Applications of Artificial Intelligence, IAAI 2022, The Twelfth Symposium on Educational Advances in Artificial Intelligence, EAAI 2022, Virtual Event, February 22 - March 1, 2022, pages 3662–3670. AAAI Press, 2022. | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/87976 | - |
| dc.description.abstract | 惡意攻擊者可以透過添加微小的擾動生成目標性對抗例,促使神經網路產生特定的錯誤輸出。在跨模型的遷移性下,即使無法直接存取神經網路模型的參數,模型一樣可能受到對抗例的攻擊。現今研究提出以集成式方法來生成對抗例以增加遷移性。為了更加提升遷移性,模型增強法會增加參與集成式方法的模型數量。然而,現存的模型增強法只在無目標性攻擊的設定上進行。在本作中,我們提出多樣化權重修剪,一個新穎的模型增強法,來產生目標性攻擊。相較於以往的研究,多樣化權重修剪會保護模型中必要的權重,並同時確保修剪後模型的多樣性。我們將在實驗中呈現此對目標性攻擊的重要性。我們在 ImageNet 兼容資料集上,提供了更具挑戰性的設定下的實驗結果:分別是遷移到經過對抗訓練的模型、非捲積神經網路模型以及 Google 雲端電腦視覺服務。結果顯示我們提出的多樣化權重修剪分別能在三個設定下,基於最新方法,提升目標性攻擊成功率10.1%, 6.6%, 以及 7.0%。我們將會在投稿接受後開放程式碼。 | zh_TW |
| dc.description.abstract | Malicious attackers can generate targeted adversarial examples by adding tiny perturbations, forcing neural networks to produce specific incorrect outputs. Because adversarial examples transfer across models, networks remain vulnerable even in black-box settings where their parameters are inaccessible. Recent studies have shown the effectiveness of ensemble-based methods in generating transferable adversarial examples. To further enhance transferability, model augmentation methods produce additional networks to participate in the ensemble. However, existing model augmentation methods have only been proven effective for untargeted attacks. In this work, we propose Diversified Weight Pruning (DWP), a novel model augmentation technique for generating transferable targeted attacks. DWP leverages the weight pruning commonly used in model compression. Compared with prior work, DWP protects necessary connections while ensuring the diversity of the pruned models, both of which we show are crucial for targeted transferability. Experiments on the ImageNet-compatible dataset under three more challenging scenarios confirm the effectiveness of DWP: transferring to adversarially trained models, to non-CNN architectures, and to Google Cloud Vision. The results show that DWP improves targeted attack success rates by up to 10.1%, 6.6%, and 7.0% in these scenarios, respectively, when combined with state-of-the-art methods. The source code will be made available after acceptance. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-08-01T16:11:13Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2023-08-01T16:11:13Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Acknowledgements iii
摘要 (Chinese abstract) v
Abstract vii
Contents ix
List of Figures xiii
List of Tables xv
Denotation xvii
Chapter 1 Introduction 1
Chapter 2 Related Work 5
2.1 Transferable Attack 5
2.1.1 Gradient Optimization 5
2.1.2 Input Transformation 6
2.1.3 Modern Loss Function 6
2.1.4 Ensemble and Model Augmentation 7
2.2 Network Pruning 7
Chapter 3 Methodology 9
3.1 Preliminary and Motivation 9
3.1.1 Momentum and Nesterov Iterative Method (NI) 9
3.1.2 Scale Invariant Method (SI) 10
3.1.3 Diverse Input Patterns (DI) 11
3.1.4 Translation Invariant Method (TI) 11
3.2 Diversified Weight Pruning 12
Chapter 4 Experiments 15
4.1 Experimental Setup 15
4.1.1 Dataset 15
4.1.2 Networks 15
4.1.3 Hyper-parameters 16
4.1.4 Baseline Methods 16
4.2 Transferable Targeted Attack in Various Scenarios 17
4.2.1 Transferring across Naturally Trained CNNs 17
4.2.2 Transferring to Adversarially Trained Models 18
4.2.3 Transferring to Non-CNN Architectures 19
4.2.4 Transferring to Google Cloud Vision 20
4.3 Perturbations from Different Pruned Models 21
4.4 Semantics of the Target Class 23
Chapter 5 Conclusion 27
References 29
Appendix A - Supplementary Material 39
A.1 Ablation Analysis on Prunable Rates 39
A.2 Transferring across CNNs with Similar Architectures 39
A.3 Transferring to Multi-Step Adversarially Trained Models 41
A.4 Untargeted and Single-Model Results 42
A.5 Time Cost of DWP 43
A.6 Compare with VT, SVRE and IG 43
A.7 Results of DWP on Google Cloud Vision 47
A.8 Samples of GradCAM 49 | - |
| dc.language.iso | en | - |
| dc.subject | 神經網路修剪 | zh_TW |
| dc.subject | 對抗式攻擊 | zh_TW |
| dc.subject | 電腦視覺與圖型識別 | zh_TW |
| dc.subject | Adversarial Attack | en |
| dc.subject | Network Pruning | en |
| dc.subject | Computer Vision and Pattern Recognition | en |
| dc.title | 透過多樣化的權重修剪提升目標攻擊的可轉移性 | zh_TW |
| dc.title | Enhancing Targeted Attack Transferability via Diversified Weight Pruning | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 111-2 | - |
| dc.description.degree | Master's | - |
| dc.contributor.oralexamcommittee | 陳祝嵩;曹昱 | zh_TW |
| dc.contributor.oralexamcommittee | Chu-Song Chen;Yu Tsao | en |
| dc.subject.keyword | 對抗式攻擊,神經網路修剪,電腦視覺與圖型識別, | zh_TW |
| dc.subject.keyword | Adversarial Attack,Network Pruning,Computer Vision and Pattern Recognition, | en |
| dc.relation.page | 51 | - |
| dc.identifier.doi | 10.6342/NTU202301130 | - |
| dc.rights.note | Consent to authorization (open access worldwide) | - |
| dc.date.accepted | 2023-06-29 | - |
| dc.contributor.author-college | College of Electrical Engineering and Computer Science | - |
| dc.contributor.author-dept | Department of Computer Science and Information Engineering | - |
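The abstract above describes DWP only at a high level: prune a white-box model several times to obtain diverse yet still-functional copies, then attack the ensemble of copies so the targeted perturbation does not overfit any single network. The sketch below is a reading aid, not the thesis's exact algorithm; it assumes PyTorch, uses a magnitude-based criterion as a stand-in for "protecting necessary connections", uses random masks for diversity, and substitutes a plain iterative targeted attack for the MI/NI/SI/DI/TI pipeline evaluated in the thesis. The function names (`make_pruned_copy`, `targeted_ensemble_attack`) and the `prunable_rate` parameter are illustrative assumptions.

```python
# Illustrative sketch of diversified-pruning-style model augmentation for a
# targeted ensemble attack. Hedged assumptions: magnitude-based pruning,
# random masks for diversity, plain iterative L-inf targeted update.
import copy
import torch
import torch.nn.functional as F

def make_pruned_copy(model, prunable_rate=0.1):
    """Return a copy with a random subset of the smallest-magnitude weights
    zeroed. Sparing large weights approximates 'protecting necessary
    connections'; sampling among the small ones gives each copy a different
    mask, i.e. diversity across the augmented ensemble."""
    pruned = copy.deepcopy(model)
    with torch.no_grad():
        for p in pruned.parameters():
            if p.dim() < 2:                 # skip biases / norm parameters
                continue
            flat = p.abs().flatten()
            # candidate pool: the 2*prunable_rate smallest weights (rate <= 0.5)
            k = int(flat.numel() * prunable_rate * 2)
            if k < 2:
                continue
            candidates = torch.topk(flat, k, largest=False).indices
            drop = candidates[torch.randperm(k)[: k // 2]]  # random half of pool
            mask = torch.ones_like(flat)
            mask[drop] = 0.0
            p.mul_(mask.view_as(p))         # zero out the selected weights
    return pruned.eval()

def targeted_ensemble_attack(models, x, target, eps=16/255, alpha=2/255, steps=100):
    """Iterative targeted attack that averages logits over the ensemble of
    (pruned) models, with an L-inf projection back into the eps-ball."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        logits = torch.stack([m(x_adv) for m in models]).mean(0)
        loss = F.cross_entropy(logits, target)      # loss toward the target class
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv - alpha * grad.sign()               # descend toward target
            x_adv = x + (x_adv - x).clamp(-eps, eps)          # L-inf projection
            x_adv = x_adv.clamp(0.0, 1.0)                     # valid pixel range
    return x_adv.detach()
```

Under these assumptions, a caller would build the augmented ensemble from a single white-box model, e.g. `models = [white_box.eval()] + [make_pruned_copy(white_box) for _ in range(3)]` (names hypothetical), then call `targeted_ensemble_attack(models, x, target)`; the point of the random per-copy masks is that each pruned model contributes a slightly different gradient, which is the diversity property the thesis argues matters for targeted transferability.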
| Appears in Collections: | Department of Computer Science and Information Engineering | |
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-111-2.pdf | 14.66 MB | Adobe PDF | View/Open |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.