Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98457

Full metadata record

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 林守德 | zh_TW |
| dc.contributor.advisor | Shou-De Lin | en |
| dc.contributor.author | 林承濬 | zh_TW |
| dc.contributor.author | Cheng-Chun Lin | en |
| dc.date.accessioned | 2025-08-14T16:11:38Z | - |
| dc.date.available | 2025-08-15 | - |
| dc.date.copyright | 2025-08-14 | - |
| dc.date.issued | 2025 | - |
| dc.date.submitted | 2025-08-01 | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98457 | - |
| dc.description.abstract | 在眾多自動化提示詞優化的方法中,指令的調整與範例的使用已經展現出強大的效果。本研究將優化範例的方法分為兩大類:靜態方法,即對所有測試樣本都使用一組固定的範例;以及動態方法,會根據測試樣本調整所使用的範例。儘管這兩種方法都在解決相同的問題,相關的系統性比較研究仍相對有限。因此,本研究針對具有代表性的靜態與動態方法,在多個資料集和實務情境中進行比較,以展示兩種方法間的差異。結果顯示動態方法在多數情境中表現優於靜態方法,但不同策略的效果仍會依據資料特性而異。 | zh_TW |
| dc.description.abstract | Among the many automatic prompt optimization approaches, instruction tuning and the use of exemplars have proven highly effective. We categorize exemplar optimization methods into two paradigms: static methods, which use one fixed set of exemplars for all test samples, and dynamic methods, which adapt the exemplars to each input. Although both paradigms address the same challenge, systematic comparisons between them remain scarce. In this paper, we conduct an empirical study that compares representative static and dynamic methods across diverse benchmark datasets and real-world scenarios to demonstrate the differences between the two directions. Our results show that dynamic methods often outperform static ones, though the effectiveness of each strategy varies with data characteristics. (An illustrative sketch of the two paradigms follows this metadata record.) | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-08-14T16:11:38Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2025-08-14T16:11:38Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Oral Examination Committee Certification i
Acknowledgements ii
Abstract (Chinese) iii
Abstract iv
Contents v
List of Figures vii
List of Tables viii
Chapter 1 Introduction 1
Chapter 2 Related Work 4
2.1 Automatic Prompt Optimization 4
2.1.1 Instruction Tuning 4
2.1.2 Exemplar Optimization 5
2.2 Retrieval-Augmented Generation (RAG) 6
Chapter 3 Experiment Settings 7
3.1 Models 7
3.2 Datasets 7
3.3 Optimization Methods 8
Chapter 4 Results and Insights 11
4.1 Average Performance Results 11
4.1.1 Observation on AGNews 16
4.1.2 Observation on the number of exemplars 16
4.1.3 Observation on exemplars overlap and importance 18
4.2 Insight 1: Diverse Distribution 21
4.3 Insight 2: Skewed Distribution 23
4.4 Insight 3: Sparse Distribution 25
Chapter 5 Conclusion 28
References 29 | - |
| dc.language.iso | en | - |
| dc.subject | 自動化提示詞優化 | zh_TW |
| dc.subject | 大型語言模型 | zh_TW |
| dc.subject | 範例優化 | zh_TW |
| dc.subject | 上下文學習 | zh_TW |
| dc.subject | Automatic Prompt Optimization | en |
| dc.subject | In-Context Learning | en |
| dc.subject | Large Language Models | en |
| dc.subject | Exemplar Optimization | en |
| dc.title | 上下文學習優化方法之比較:靜態與動態策略分析 | zh_TW |
| dc.title | Optimization Methods for In-Context Learning: Comparing Static and Dynamic Strategies | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 113-2 | - |
| dc.description.degree | Master's | - |
| dc.contributor.oralexamcommittee | 李政德;廖耿德 | zh_TW |
| dc.contributor.oralexamcommittee | Cheng-Te Li;Keng-Te Liao | en |
| dc.subject.keyword | 大型語言模型,自動化提示詞優化,上下文學習,範例優化 | zh_TW |
| dc.subject.keyword | Large Language Models, Automatic Prompt Optimization, In-Context Learning, Exemplar Optimization | en |
| dc.relation.page | 34 | - |
| dc.identifier.doi | 10.6342/NTU202502349 | - |
| dc.rights.note | Authorized for release (restricted to on-campus access) | - |
| dc.date.accepted | 2025-08-05 | - |
| dc.contributor.author-college | 電機資訊學院 | - |
| dc.contributor.author-dept | 資訊工程學系 | - |
| dc.date.embargo-lift | 2025-08-15 | - |
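
To make the static/dynamic distinction in the abstract above concrete, here is a minimal Python sketch of the two paradigms. Everything in it is an illustrative assumption rather than the thesis's actual implementation: `POOL` is a hypothetical labeled exemplar pool, `embed` is a toy bag-of-words stand-in for a real sentence encoder (e.g. Sentence-BERT), and the kNN-style retrieval in `dynamic_exemplars` is just one common dynamic strategy.

```python
from collections import Counter
import math

# Hypothetical labeled pool to draw exemplars from; a stand-in for a
# real training set such as news-topic classification data.
POOL = [
    ("The team won the championship game.", "Sports"),
    ("Stocks fell sharply after the earnings report.", "Business"),
    ("The new GPU doubles model training throughput.", "Sci/Tech"),
    ("Parliament passed the national budget bill today.", "World"),
]

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would use a sentence encoder.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def static_exemplars(k: int = 2) -> list:
    # Static paradigm: one fixed exemplar set reused for every test input.
    return POOL[:k]

def dynamic_exemplars(query: str, k: int = 2) -> list:
    # Dynamic paradigm: re-select the k pool items most similar to the query.
    q = embed(query)
    return sorted(POOL, key=lambda ex: cosine(q, embed(ex[0])), reverse=True)[:k]

def build_prompt(query: str, exemplars: list) -> str:
    # Assemble a few-shot prompt from (text, label) exemplar pairs.
    shots = "\n".join(f"Text: {x}\nLabel: {y}" for x, y in exemplars)
    return f"{shots}\nText: {query}\nLabel:"

query = "The central bank raised interest rates again."
print(build_prompt(query, static_exemplars()))        # same shots for any query
print("---")
print(build_prompt(query, dynamic_exemplars(query)))  # shots follow the query
```

The only difference between the two branches is whether the exemplar set depends on the query: here the dynamic path surfaces the Business exemplar for a finance query, while the static set stays fixed regardless of input, which mirrors the adaptivity the abstract credits for dynamic methods' stronger average performance.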
Appears in Collections: 資訊工程學系
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-113-2.pdf (restricted to NTU campus IP addresses; use the VPN service for off-campus access) | 1.27 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
