NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93372
Full metadata record (DC field: value [language])
dc.contributor.advisor [zh_TW]: 郭斯彥
dc.contributor.advisor [en]: Sy-Yen Kuo
dc.contributor.author [zh_TW]: 鍾明宇
dc.contributor.author [en]: Ming-Yu Chung
dc.date.accessioned: 2024-07-30T16:11:21Z
dc.date.available: 2024-07-31
dc.date.copyright: 2024-07-30
dc.date.issued: 2024
dc.date.submitted: 2024-07-26
dc.identifier.citation:
N. Aronszajn. Theory of reproducing kernels. Transactions of the American Mathematical Society, 68(3):337–404, 1950.
S. Arora, S. Du, W. Hu, Z. Li, and R. Wang. Fine-grained analysis of optimization and generalization for overparameterized two-layer neural networks. In International Conference on Machine Learning (ICML), pages 322–332, 2019.
H. Bahng, A. Jahanian, S. Sankaranarayanan, and P. Isola. Exploring visual prompts for adapting large-scale models. arXiv preprint arXiv:2203.17274, 2022.
A. Berlinet and C. Thomas-Agnan. Reproducing kernel Hilbert spaces in probability and statistics. Springer Science & Business Media, 2011.
T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, et al. Language models are few-shot learners. Advances in neural information processing systems (NeurIPS), 33:1877–1901, 2020.
A. Chen, Y. Yao, P.-Y. Chen, Y. Zhang, and S. Liu. Understanding and improving visual prompting: A label-mapping perspective. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 19133–19143, 2023.
P.-Y. Chen. Model reprogramming: Resource-efficient cross-domain machine learning. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), volume 38, pages 22584–22591, 2024.
Y. Chen, W. Huang, H. Wang, C. Loh, A. Srivastava, L. Nguyen, and L. Weng. Analyzing generalization of neural networks through loss path kernels. Advances in Neural Information Processing Systems (NeurIPS), 36, 2024.
M.-Y. Chung, S.-Y. Chou, C.-M. Yu, P.-Y. Chen, S.-Y. Kuo, and T.-Y. Ho. Rethinking backdoor attacks on dataset distillation: A kernel method perspective. In The Twelfth International Conference on Learning Representations (ICLR), 2023.
S. Du, J. Lee, H. Li, L. Wang, and X. Zhai. Gradient descent finds global minima of deep neural networks. In International conference on machine learning (ICML), pages 1675–1685, 2019.
G. F. Elsayed, I. Goodfellow, and J. Sohl-Dickstein. Adversarial reprogramming of neural networks. arXiv preprint arXiv:1806.11146, 2018.
M. Englert and R. Lazic. Adversarial reprogramming revisited. Advances in Neural Information Processing Systems, 35:28588–28600, 2022.
B. Ghojogh, A. Ghodsi, F. Karray, and M. Crowley. Reproducing kernel Hilbert space, Mercer's theorem, eigenfunctions, Nyström method, and use of kernels in machine learning: Tutorial and survey. arXiv preprint arXiv:2106.08443, 2021.
K. Hambardzumyan, H. Khachatrian, and J. May. Warp: Word-level adversarial reprogramming. arXiv preprint arXiv:2101.00121, 2021.
B. He, B. Lakshminarayanan, and Y. W. Teh. Bayesian deep ensembles via the neural tangent kernel. Advances in neural information processing systems (NeurIPS), 33: 1010–1022, 2020.
J. Huang and H.-T. Yau. Dynamics of deep neural networks and neural tangent hierarchy. In International conference on machine learning (ICML), pages 4542–4551, 2020.
A. Jacot, F. Gabriel, and C. Hongler. Neural tangent kernel: Convergence and generalization in neural networks. Advances in neural information processing systems (NeurIPS), 31, 2018.
M. Jin, S. Wang, L. Ma, Z. Chu, J. Y. Zhang, X. Shi, P.-Y. Chen, Y. Liang, Y.-F. Li, S. Pan, et al. Time-llm: Time series forecasting by reprogramming large language models. arXiv preprint arXiv:2310.01728, 2023.
G. Kimeldorf and G. Wahba. Some results on tchebycheffian spline functions. Journal of mathematical analysis and applications, 33(1):82–95, 1971.
J. Lee, L. Xiao, S. Schoenholz, Y. Bahri, R. Novak, J. Sohl-Dickstein, and J. Pennington. Wide neural networks of any depth evolve as linear models under gradient descent. Advances in neural information processing systems (NeurIPS), 32, 2019.
Y. Li, Y.-L. Tsai, C.-M. Yu, P.-Y. Chen, and X. Ren. Exploring the benefits of visual prompting in differential privacy. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 5158–5167, 2023.
I. Melnyk, V. Chenthamarakshan, P.-Y. Chen, P. Das, A. Dhurandhar, I. Padhi, and D. Das. Reprogramming pretrained language models for antibody sequence infilling. In International Conference on Machine Learning, pages 24398–24419. PMLR, 2023.
M. Mohri, A. Rostamizadeh, and A. Talwalkar. Foundations of Machine Learning. The MIT Press, 2012. ISBN 026201825X.
T. Nguyen, Z. Chen, and J. Lee. Dataset meta-learning from kernel ridge-regression. In International Conference on Learning Representations (ICLR), 2020.
B. Ronen, D. Jacobs, Y. Kasten, and S. Kritchman. The convergence rate of neural networks for learned functions of different frequencies. Advances in Neural Information Processing Systems (NeurIPS), 32, 2019.
Y.-Y. Tsai, P.-Y. Chen, and T.-Y. Ho. Transfer learning without knowing: Reprogramming black-box machine learning models with scarce data and limited resources. In International Conference on Machine Learning, pages 9614–9624. PMLR, 2020.
H.-A. Tsao, L. Hsiung, P.-Y. Chen, S. Liu, and T.-Y. Ho. Autovp: An automated visual prompting framework and benchmark. In The Twelfth International Conference on Learning Representations, 2023.
R. Vinod, P.-Y. Chen, and P. Das. Reprogramming pretrained language models for protein sequence representation learning. arXiv preprint arXiv:2301.02120, 2023.
C.-H. H. Yang, Y.-Y. Tsai, and P.-Y. Chen. Voice2series: Reprogramming acoustic models for time series classification. In International conference on machine learning (ICML), pages 11808–11819, 2021.
C.-H. H. Yang, B. Li, Y. Zhang, N. Chen, R. Prabhavalkar, T. N. Sainath, and T. Strohman. From English to more languages: Parameter-efficient model reprogramming for cross-lingual speech recognition. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93372
dc.description.abstract [zh_TW]: 模型重程式化 (MR) [Chen, 2024] 是一種能夠有效率地使用計算資源的微調方法,用於在不修改預訓練模型的權重的情況下,將預訓練的模型重新用於解決新任務。本文通過神經正切核 (NTK) 的視角探討 MR,旨在增強對這一機器學習技術的理解和效能。通過深入研究 MR 的理論基礎,並利用 NTK 框架的見解,本研究闡明了促成 MR 算法成功的核心機制。借助 NTK 理論,這篇論文提供了新的見解和解釋,並且加深對模型重程式化的理解。
dc.description.abstract [en]: Model reprogramming (MR) [Chen, 2024] is a resource-efficient fine-tuning method for repurposing a pretrained machine learning model to solve new tasks without modifying the pretrained weights. This paper explores MR through the lens of the Neural Tangent Kernel (NTK), aiming to enhance the comprehension and efficacy of this machine learning technique. By delving into the theoretical underpinnings of MR and utilizing insights from the NTK framework, this research elucidates the core mechanisms that contribute to the success of MR algorithms. Drawing on NTK theory, this work offers novel insights and explanations to deepen understanding of the model reprogramming process.
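The repurposing described in the abstract can be pictured with the common input-side instantiation of model reprogramming: every pretrained weight stays frozen, a single universal perturbation added to each target input is learned, and source classes are mapped to target classes by a fixed many-to-one assignment. The PyTorch-style sketch below is illustrative only and is not code from the thesis; the class name ReprogrammingWrapper, the additive perturbation delta, and the label_map argument are assumptions made for this example.

import torch
import torch.nn as nn

class ReprogrammingWrapper(nn.Module):
    # Reuse a frozen pretrained classifier for a new task (illustrative sketch).
    def __init__(self, pretrained_model: nn.Module, input_shape, label_map):
        super().__init__()
        self.pretrained = pretrained_model
        for p in self.pretrained.parameters():
            p.requires_grad_(False)  # source weights stay fixed, as in MR
        # Trainable universal input perturbation (the "program"), one per task.
        self.delta = nn.Parameter(torch.zeros(*input_shape))
        # Fixed many-to-one map: source class index -> target class index.
        self.register_buffer("label_map", torch.as_tensor(label_map, dtype=torch.long))

    def forward(self, x):
        source_logits = self.pretrained(x + self.delta)  # reprogrammed input
        num_target = int(self.label_map.max().item()) + 1
        target_logits = source_logits.new_zeros(x.size(0), num_target)
        # Sum the logits of all source classes assigned to each target class.
        target_logits.index_add_(1, self.label_map, source_logits)
        return target_logits

In this sketch only delta would receive gradients (for example, an optimizer built over [model.delta]), so training updates a parameter the size of one input rather than the full network, which is what makes MR resource-efficient.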
dc.description.provenance [en]: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-07-30T16:11:21Z. No. of bitstreams: 0
dc.description.provenance [en]: Made available in DSpace on 2024-07-30T16:11:21Z (GMT). No. of bitstreams: 0
dc.description.tableofcontents:
Acknowledgements i
摘要 ii
Abstract iii
Contents iv
List of Figures vi
List of Tables vii
Denotation viii
Chapter 1 Introduction 1
Chapter 2 Preliminaries and Related Works 4
2.1 Model Reprogramming 4
2.2 Reproducing Kernel Hilbert Space 6
2.3 Neural Tangent Kernel 7
Chapter 3 Theoretical Framework 10
3.1 Empirical Risk of MR 11
3.2 Generalization Gap of MR 13
3.3 Summary and Remark 15
Chapter 4 Relation between NTK, Target Distribution and Source Distribution 17
Chapter 5 Numerical Results 27
5.1 Experimental Setting 27
5.2 Experimental Compute Resources 29
5.3 Training of the Source Model 29
5.4 Training of the Target Model 29
5.5 Experimental Results 30
5.6 Experiments on Large Scale Image 34
Chapter 6 Conclusion 37
References 38
dc.language.iso: en
dc.subject [zh_TW]: 模型重程式化
dc.subject [zh_TW]: 神經正切核
dc.subject [zh_TW]: 再生核希爾伯特空間
dc.subject [en]: Neural Tangent Kernel
dc.subject [en]: Model Reprogramming
dc.subject [en]: Reproducing Kernel Hilbert Space
dc.title [zh_TW]: 使用神經正切核解釋模型重程式化
dc.title [en]: Model Reprogramming Demystified: A Neural Tangent Kernel Perspective
dc.type: Thesis
dc.date.schoolyear: 112-2
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee [zh_TW]: 游家牧;顏嗣鈞;雷欽隆;陳英一
dc.contributor.oralexamcommittee [en]: Chia-Mu Yu;Hsu-chun Yen;Chin-Laung Lei;Ing-Yi Chen
dc.subject.keyword [zh_TW]: 模型重程式化,神經正切核,再生核希爾伯特空間
dc.subject.keyword [en]: Model Reprogramming, Neural Tangent Kernel, Reproducing Kernel Hilbert Space
dc.relation.page: 41
dc.identifier.doi: 10.6342/NTU202401278
dc.rights.note: 同意授權(限校園內公開) (access authorized, restricted to campus)
dc.date.accepted: 2024-07-29
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science)
dc.contributor.author-dept: 電機工程學系 (Department of Electrical Engineering)
dc.date.embargo-lift: 2029-07-25
Appears in Collections: 電機工程學系 (Department of Electrical Engineering)

Files in this item:
File: ntu-112-2.pdf (not authorized for public access)
Size: 4.54 MB
Format: Adobe PDF

