Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98342
Full metadata record
DC Field / Value / Language
dc.contributor.advisor: 孫紹華 (zh_TW)
dc.contributor.advisor: Shao-Hua Sun (en)
dc.contributor.author: 楊凱恩 (zh_TW)
dc.contributor.author: Kai-En Yang (en)
dc.date.accessioned: 2025-08-04T16:05:41Z
dc.date.available: 2025-08-05
dc.date.copyright: 2025-08-04
dc.date.issued: 2025
dc.date.submitted: 2025-08-01
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98342
dc.description.abstract: With the growing reliance on user data, the emerging field of machine unlearning has received increasing attention. It aims to selectively remove the influence of specific data from machine learning models without full re-training. While machine unlearning has been studied extensively for classification and generative models, its application to reinforcement learning remains largely unexplored. Reinforcement learning poses unique challenges because of its sequential decision-making nature. To address these challenges, this thesis defines unlearning in reinforcement learning as removing the influence of transition information at specific states, so that the environment associated with those states becomes unexplored from the model's perspective. We propose a formal mathematical framework for exact unlearning, refine the re-training strategy, and design an efficient unlearning algorithm that incorporates Gaussian noise into both value-based and policy-based methods. Experimental results covering discrete and continuous state spaces demonstrate effective forgetting. The proposed algorithm consistently matches the golden standard of re-training while substantially reducing training time. Moreover, when applied to the primacy-bias setting, the method outperforms an existing baseline, validating its broader practical applicability. (zh_TW)
dc.description.abstract: The growing reliance on user data has brought attention to the emerging field of machine unlearning, which focuses on selectively removing the influence of specific data (or groups of data) from machine learning models without requiring full re-training. While machine unlearning has been extensively studied in classification and generative models, its application to reinforcement learning remains largely unexplored. Reinforcement learning poses unique challenges due to its sequential decision-making nature. In this paper, we address these challenges by defining unlearning in reinforcement learning as the removal of information about transitions at specific states, rendering the environment related to those states unexplored for the agent. We propose a formal mathematical framework for exact unlearning, refine the re-training strategy, and introduce an efficient unlearning algorithm that incorporates Gaussian noise into both value-based and policy-based methods. Experimental results across discrete and continuous state spaces demonstrate effective unlearning performance. The proposed algorithm consistently matches the golden baseline of re-training while requiring less training time. Applications to the primacy bias further illustrate superior performance compared to an existing baseline, validating its broader practical applicability. (en)
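The abstract describes, at a high level, injecting Gaussian noise into value-based and policy-based updates so that designated states appear unexplored. The Python sketch below is one illustrative reading of that idea for a DQN-style update; it is an assumption-laden sketch, not the thesis's actual algorithm, and the names dqn_targets_with_unlearning, forget_mask, and noise_std are hypothetical labels introduced here.

import torch
import torch.nn as nn

def dqn_targets_with_unlearning(q_net, target_net, batch, forget_mask,
                                gamma=0.99, noise_std=0.1):
    """Compute a DQN-style loss in which transitions starting in states marked
    for unlearning receive zero-mean Gaussian targets instead of bootstrapped
    TD targets, nudging their Q-values back toward an uninformed prior."""
    with torch.no_grad():
        # Standard one-step TD target from the target network.
        next_q = target_net(batch["next_obs"]).max(dim=1).values
        td_target = batch["rewards"] + gamma * (1.0 - batch["dones"]) * next_q
        # For forgotten transitions, replace the target with Gaussian noise.
        noise = noise_std * torch.randn_like(td_target)
        targets = torch.where(forget_mask, noise, td_target)
    q_pred = q_net(batch["obs"]).gather(1, batch["actions"].unsqueeze(1)).squeeze(1)
    return nn.functional.mse_loss(q_pred, targets)

if __name__ == "__main__":
    # Tiny smoke test with random data; shapes only, no real environment.
    obs_dim, n_actions, batch_size = 4, 2, 8
    q_net = nn.Sequential(nn.Linear(obs_dim, 32), nn.ReLU(), nn.Linear(32, n_actions))
    target_net = nn.Sequential(nn.Linear(obs_dim, 32), nn.ReLU(), nn.Linear(32, n_actions))
    batch = {
        "obs": torch.randn(batch_size, obs_dim),
        "next_obs": torch.randn(batch_size, obs_dim),
        "actions": torch.randint(0, n_actions, (batch_size,)),
        "rewards": torch.randn(batch_size),
        "dones": torch.zeros(batch_size),
    }
    forget_mask = torch.zeros(batch_size, dtype=torch.bool)
    forget_mask[:2] = True  # pretend the first two transitions start in states to be forgotten
    print("loss:", dqn_targets_with_unlearning(q_net, target_net, batch, forget_mask).item())

Overwriting TD targets with zero-mean noise is only one plausible way to realize "incorporating Gaussian noise"; the thesis itself covers both value-based (DQN) and policy-based (PPO) variants.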
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-08-04T16:05:41Z. No. of bitstreams: 0 (en)
dc.description.provenance: Made available in DSpace on 2025-08-04T16:05:41Z (GMT). No. of bitstreams: 0 (en)
dc.description.tableofcontents:
摘要 (Chinese abstract) i
Abstract ii
Contents iii
List of Figures v
List of Tables ix
Chapter 1 Introduction 1
Chapter 2 Background and related works 5
2.1 Reinforcement learning 5
2.2 Machine unlearning 8
2.3 Machine unlearning and RL 9
Chapter 3 How is unlearning different in RL? 11
3.1 Unlearning target 11
3.2 Why state unlearning is unique in reinforcement learning 12
3.3 Illustrations of unlearning in imitation learning settings 13
3.4 Implications of state unlearning 14
Chapter 4 The golden baseline: Re-training in RL 15
4.1 Conventional implementation 15
4.2 State randomisation 16
Chapter 5 Our unlearning algorithm 19
5.1 General methodology 19
5.2 Off-policy value-based methods (DQN) 20
5.3 On-policy policy-based methods (PPO) 20
Chapter 6 Empirical studies 23
6.1 Tabular environments 23
6.1.1 Evaluation metric 24
6.1.2 Theoretical results 25
6.1.3 Empirical results: DQN and PPO 25
6.1.4 Sanity check 27
6.2 MetaDrive continuous environment 28
6.2.1 Empirical results: PPO in MetaDrive 30
6.3 Primacy bias setting 31
Chapter 7 Final Remark 33
References 35
Appendix A — The Impact Statement 41
A.1 The Impact Statement 41
Appendix B — Appendix 43
B.1 Unlearning for reinforcement learning 43
B.2 Omitted proofs and theoretical results 45
B.3 Details on the experimental setup 46
B.4 Additional results and experiments 47
dc.language.iso: en
dc.subject: 機器反學習 (zh_TW)
dc.subject: 環境狀態反學習 (zh_TW)
dc.subject: 強化學習 (zh_TW)
dc.subject: 深度學習 (zh_TW)
dc.subject: 機器學習 (zh_TW)
dc.subject: machine learning (en)
dc.subject: reinforcement learning (en)
dc.subject: machine unlearning (en)
dc.subject: state unlearning (en)
dc.subject: deep learning (en)
dc.title: 強化學習之反學習 (zh_TW)
dc.title: Towards Unlearning in Reinforcement Learning (en)
dc.type: Thesis
dc.date.schoolyear: 113-2
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 謝秉均; 徐祥 (zh_TW)
dc.contributor.oralexamcommittee: Ping-Chun Hsieh; Hsiang Hsu (en)
dc.subject.keyword: 強化學習, 機器反學習, 機器學習, 深度學習, 環境狀態反學習 (zh_TW)
dc.subject.keyword: reinforcement learning, machine unlearning, machine learning, deep learning, state unlearning (en)
dc.relation.page: 56
dc.identifier.doi: 10.6342/NTU202502402
dc.rights.note: 未授權 (Not authorized for public access)
dc.date.accepted: 2025-08-01
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science)
dc.contributor.author-dept: 電信工程學研究所 (Graduate Institute of Communication Engineering)
dc.date.embargo-lift: N/A
Appears in Collections: 電信工程學研究所 (Graduate Institute of Communication Engineering)

Files in This Item:
File: ntu-113-2.pdf (restricted; not authorized for public access)
Size: 23.31 MB
Format: Adobe PDF


Items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
