NTU Theses and Dissertations Repository / College of Electrical Engineering and Computer Science / Department of Computer Science and Information Engineering
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94700
Full metadata record
DC Field [Language]: Value
dc.contributor.advisor [zh_TW]: 張智星
dc.contributor.advisor [en]: Jyh-Shing Roger Jang
dc.contributor.author [zh_TW]: 林育辰
dc.contributor.author [en]: Yu-Chen Lin
dc.date.accessioned: 2024-08-16T17:36:06Z
dc.date.available: 2024-08-22
dc.date.copyright: 2024-08-16
dc.date.issued: 2024
dc.date.submitted: 2024-08-06
dc.identifier.citation:
[1] A. Balaguer, V. Benara, R. L. de Freitas Cunha, R. de M. Estevão Filho, T. Hendry, D. Holstein, et al. RAG vs fine-tuning: Pipelines, tradeoffs, and a case study on agriculture. arXiv preprint arXiv:2401.08406, 2024.
[2] R. A. Bradley and M. E. Terry. Rank analysis of incomplete block designs: I. The method of paired comparisons. Biometrika, 39(3/4):324–345, 1952.
[3] W.-L. Chiang, L. Zheng, Y. Sheng, A. N. Angelopoulos, T. Li, D. Li, et al. Chatbot Arena: An open platform for evaluating LLMs by human preference. arXiv preprint arXiv:2403.04132, 2024.
[4] J. Dean and S. Ghemawat. MapReduce: Simplified data processing on large clusters. Communications of the ACM, 51(1):107–113, 2008.
[5] D. Dua, S. Gupta, S. Singh, and M. Gardner. Successive prompting for decomposing complex questions. arXiv preprint arXiv:2212.04092, 2022.
[6] A. Dubey, A. Jauhri, A. Pandey, A. Kadian, A. Al-Dahle, A. Letman, et al. The Llama 3 herd of models. arXiv preprint arXiv:2407.21783, 2024.
[7] D. Edge, H. Trinh, N. Cheng, J. Bradley, A. Cha, A. Mody, et al. From local to global: A Graph RAG approach to query-focused summarization. arXiv preprint arXiv:2404.16130, 2024.
[8] B. Efron. Bootstrap methods: Another look at the jackknife. In Breakthroughs in Statistics: Methodology and Distribution, pages 569–593. Springer, 1992.
[9] A. Eliseev and D. Mazur. Fast inference of mixture-of-experts language models with offloading. arXiv preprint arXiv:2312.17238, 2023.
[10] A. Elo. The Rating of Chessplayers: Past and Present. Ishi Press International, 2008.
[11] Y. Gao, Y. Xiong, X. Gao, K. Jia, J. Pan, Y. Bi, et al. Retrieval-augmented generation for large language models: A survey. arXiv preprint arXiv:2312.10997, 2024.
[12] M. E. Glickman and A. C. Jones. Rating the chess rating system. Chance, 12:21–28, 1999.
[13] A. Gu and T. Dao. Mamba: Linear-time sequence modeling with selective state spaces. arXiv preprint arXiv:2312.00752, 2024.
[14] Z. He, H. Wu, X. Zhang, X. Yao, S. Zheng, H. Zheng, et al. ChatEDA: A large language model powered autonomous agent for EDA. In 2023 ACM/IEEE 5th Workshop on Machine Learning for CAD (MLCAD), pages 1–6. IEEE, 2023.
[15] C.-Y. Hsieh, C.-L. Li, C.-K. Yeh, H. Nakhost, Y. Fujii, A. Ratner, et al. Distilling step-by-step! Outperforming larger language models with less training data and smaller model sizes. arXiv preprint arXiv:2305.02301, 2023.
[16] D. Huang, Q. Bu, J. M. Zhang, M. Luck, and H. Cui. AgentCoder: Multi-agent code generation with iterative testing and optimization. arXiv preprint arXiv:2312.13000, 2024.
[17] A. Q. Jiang, A. Sablayrolles, A. Roux, A. Mensch, B. Savary, C. Bamford, et al. Mixtral of experts. arXiv preprint arXiv:2401.04088, 2024.
[18] X. Jiang, Y. Dong, L. Wang, Z. Fang, Q. Shang, G. Li, et al. Self-planning code generation with large language models. arXiv preprint arXiv:2303.06689, 2023.
[19] Z. Jiang, F. F. Xu, L. Gao, Z. Sun, Q. Liu, J. Dwivedi-Yu, et al. Active retrieval augmented generation. arXiv preprint arXiv:2305.06983, 2023.
[20] T. Kojima, S. S. Gu, M. Reid, Y. Matsuo, and Y. Iwasawa. Large language models are zero-shot reasoners. arXiv preprint arXiv:2205.11916, 2023.
[21] P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal, et al. Retrieval-augmented generation for knowledge-intensive NLP tasks. arXiv preprint arXiv:2005.11401, 2021.
[22] C. Li, J. Liang, A. Zeng, X. Chen, K. Hausman, D. Sadigh, et al. Chain of Code: Reasoning with a language model-augmented code emulator. arXiv preprint arXiv:2312.04474, 2023.
[23] C. Li, J. Wang, Y. Zhang, K. Zhu, W. Hou, J. Lian, et al. Large language models understand and can be enhanced by emotional stimuli. arXiv preprint arXiv:2307.11760, 2023.
[24] Y.-C. Lin, A. Kumar, N. Chang, W. Zhang, M. Zakir, R. Apte, et al. Novel pre-processing technique for data embedding in engineering code generation using large language model. In 1st IEEE International Workshop on LLM-Aided Design, 2024.
[25] Y.-C. Lin, A. Kumar, N. Chang, W. Zhang, M. Zakir, R. Apte, et al. Novel pre-processing technique for data embedding in engineering code generation using large language model. arXiv preprint arXiv:2311.16267, 2024.
[26] M. Liu, N. Pinckney, B. Khailany, and H. Ren. VerilogEval: Evaluating large language models for Verilog code generation. arXiv preprint arXiv:2309.07544, 2023.
[27] S. Liu, W. Fang, Y. Lu, Q. Zhang, H. Zhang, and Z. Xie. RTLCoder: Outperforming GPT-3.5 in design RTL generation with our open-source dataset and lightweight solution. In 1st IEEE International Workshop on LLM-Aided Design, 2024.
[28] S. Minaee, T. Mikolov, N. Nikzad, M. Chenaghlu, R. Socher, X. Amatriain, et al. Large language models: A survey. arXiv preprint arXiv:2402.06196, 2024.
[29] A. Mitra, L. D. Corro, S. Mahajan, A. Codas, C. Simoes, S. Agarwal, et al. Orca 2: Teaching small language models how to reason. arXiv preprint arXiv:2311.11045, 2023.
[30] E. Nijkamp, B. Pang, H. Hayashi, L. Tu, H. Wang, Y. Zhou, et al. CodeGen: An open large language model for code with multi-turn program synthesis. arXiv preprint arXiv:2203.13474, 2023.
[31] OpenAI, J. Achiam, S. Adler, S. Agarwal, L. Ahmad, I. Akkaya, et al. GPT-4 technical report. arXiv preprint arXiv:2303.08774, 2023.
[32] B. Peng, C. Li, P. He, M. Galley, and J. Gao. Instruction tuning with GPT-4. arXiv preprint arXiv:2304.03277, 2023.
[33] M. Schäfer, S. Nadi, A. Eghbali, and F. Tip. An empirical evaluation of using large language models for automated unit test generation. arXiv preprint arXiv:2302.06527, 2023.
[34] Gemma Team, M. Riviere, S. Pathak, P. G. Sessa, C. Hardin, S. Bhupatiraju, et al. Gemma 2: Improving open language models at a practical size. arXiv preprint arXiv:2408.00118, 2024.
[35] S. Thakur, B. Ahmad, H. Pearce, B. Tan, B. Dolan-Gavitt, R. Karri, et al. VeriGen: A large language model for Verilog code generation. arXiv preprint arXiv:2308.00708, 2023.
[36] H. Touvron, L. Martin, K. Stone, P. Albert, A. Almahairi, Y. Babaei, et al. Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288, 2023.
[37] Y.-D. Tsai, M. Liu, and H. Ren. RTLFixer: Automatically fixing RTL syntax errors with large language models. In Proceedings of the 61st ACM/IEEE Design Automation Conference (DAC '24), 2024.
[38] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, et al. Attention is all you need. Advances in Neural Information Processing Systems, 30, 2017.
[39] G. Xiao, Y. Tian, B. Chen, S. Han, and M. Lewis. Efficient streaming language models with attention sinks. arXiv preprint arXiv:2309.17453, 2023.
[40] S. Yao, J. Zhao, D. Yu, N. Du, I. Shafran, K. Narasimhan, et al. ReAct: Synergizing reasoning and acting in language models. arXiv preprint arXiv:2210.03629, 2023.
[41] W. X. Zhao, K. Zhou, J. Li, T. Tang, X. Wang, Y. Hou, et al. A survey of large language models. arXiv preprint arXiv:2303.18223, 2023.
[42] H. S. Zheng, S. Mishra, X. Chen, H.-T. Cheng, E. H. Chi, Q. V. Le, et al. Take a step back: Evoking reasoning via abstraction in large language models. arXiv preprint arXiv:2310.06117, 2023.
[43] L. Zheng, W.-L. Chiang, Y. Sheng, S. Zhuang, Z. Wu, Y. Zhuang, et al. Judging LLM-as-a-judge with MT-Bench and Chatbot Arena. arXiv preprint arXiv:2306.05685, 2023.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94700
dc.description.abstract [zh_TW]: 本文提出五項基於大型語言模型的程式碼生成組件,用於特定領域的腳本生成,並評估其有效性。貢獻包括:(i) 基於大型語言模型的語義分割 (Semantic Splitter) 與資料翻新 (Data Renovation) 組件,以改進資料語義的表示;(ii) 運用大型語言模型重構以產生高品質程式碼的腳本擴增 (Script Augmentation) 組件;(iii) 提出提示技術「隱形知識擴展與思考」(Implicit Knowledge Expansion and Contemplation, IKEC) 組件;(iv) 提出程式碼生成流程,以五項組件漸進式生成工程模擬軟體 RedHawk-SC 的程式碼;(v) 評估不同參考資料型態對程式碼生成的有效性。零樣本連鎖思維 (Zero-shot Chain-of-Thought, ZCoT) 為已知有效的提示技術,納入五項建設性組件之中,作為評估其餘組件有效性的基準。我們邀請 28 位領域專家透過競技場式評估蒐集 187 份成對比較結果,以驗證前述組件之有效性;其中最佳組件於工程軟體 RedHawk-SC 的 MapReduce 程式碼生成上達到 21.26% 的勝率提升,顯著優於零樣本連鎖思維的 6.68% 勝率提升。
dc.description.abstract [en]: We propose five constructive components based on Large Language Models (LLMs) for domain-specific code generation and evaluate their effectiveness. The contributions are (i) Semantic splitter and data renovation for improved data semantic representation; (ii) Script augmentation for enhanced code quality; (iii) Implicit Knowledge Expansion and Contemplation (IKEC) prompting technique; (iv) A workflow using hierarchical generation for scripts in the engineering software RedHawk-SC; (v) An evaluation of different reference data types for code generation. We invited 28 domain experts to conduct an arena-style evaluation, collecting 187 paired comparisons to validate the effectiveness of those components. The best component achieved a 21.26% win rate improvement in MapReduce code generation performance for RedHawk-SC, significantly outperforming the 6.68% win rate improvement of the Zero-shot Chain-of-Thought (ZCoT).
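The arena-style evaluation described in the abstract ranks components from expert pairwise votes. As a generic illustrative sketch only (not the thesis code; the component names and vote counts below are hypothetical), the following Python snippet estimates Elo ratings from a list of (winner, loser) comparisons, averaging over shuffled vote orders because sequential Elo updates are order-sensitive:

```python
import random

def elo_ratings(comparisons, k=4.0, base=1000.0, shuffles=100, seed=0):
    """Estimate Elo ratings from pairwise comparisons.

    `comparisons` is a list of (winner, loser) name pairs. Sequential Elo
    depends on vote order, so we average the ratings obtained from many
    random orderings of the same votes.
    """
    rng = random.Random(seed)
    players = {p for pair in comparisons for p in pair}
    totals = {p: 0.0 for p in players}
    for _ in range(shuffles):
        ratings = {p: base for p in players}
        order = comparisons[:]
        rng.shuffle(order)
        for winner, loser in order:
            # Expected score of the winner under the logistic Elo model.
            exp_win = 1.0 / (1.0 + 10 ** ((ratings[loser] - ratings[winner]) / 400))
            ratings[winner] += k * (1.0 - exp_win)  # winner gains
            ratings[loser] -= k * (1.0 - exp_win)   # loser loses the same amount
        for p in players:
            totals[p] += ratings[p]
    return {p: totals[p] / shuffles for p in players}

# Hypothetical tallies: component "IKEC" beats baseline "ZCoT" 12 times, loses 8.
votes = [("IKEC", "ZCoT")] * 12 + [("ZCoT", "IKEC")] * 8
print(elo_ratings(votes))
```

The same vote list could also feed a Bradley-Terry fit, or a bootstrap over resampled votes to attach confidence intervals, which are the other evaluation tools named in the table of contents.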
dc.description.provenance [en]: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-08-16T17:36:06Z. No. of bitstreams: 0
dc.description.provenance [en]: Made available in DSpace on 2024-08-16T17:36:06Z (GMT). No. of bitstreams: 0
dc.description.tableofcontents:
Verification Letter from the Oral Examination Committee
Acknowledgements
摘要 (Chinese Abstract)
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Research Topic
  1.2 Motivation
  1.3 Challenges
  1.4 Contributions
  1.5 Methodology Overview
  1.6 Overview of Engineering Tools Applicable to the Code Generator
    1.6.1 RedHawk-SC
      1.6.1.1 RedHawk vs RedHawk-SC
      1.6.1.2 Technical Architecture
    1.6.2 MapReduce
  1.7 Chapter Overview
Chapter 2 Literature Review
  2.1 Large Language Models (LLMs)
    2.1.1 Various Large Language Models
    2.1.2 Retrieval-Augmented Generation (RAG)
    2.1.3 RAG vs Fine-tuning
    2.1.4 Enhancing Input Tokens
    2.1.5 Prompt Techniques and Mechanisms
    2.1.6 Distillation
  2.2 Related Work in Code Generation
  2.3 Evaluation
    2.3.1 Chatbot Arena
    2.3.2 Elo
    2.3.3 Bradley-Terry Model
    2.3.4 Bootstrap
Chapter 3 Methodology
  3.1 Problem Definition
    3.1.1 Technical Bottlenecks and Issues in Existing RAG Technologies
    3.1.2 Proposed Methods
  3.2 Method Description
    3.2.1 Semantic Splitter
    3.2.2 Data Renovation
    3.2.3 Script Augmentation
    3.2.4 Implicit Knowledge Expansion and Contemplation (IKEC)
    3.2.5 Zero-shot Chain-of-Thought (ZCoT)
    3.2.6 Code Generation Pipeline
      3.2.6.1 Task Planning
      3.2.6.2 Script Generation
  3.3 Experimental Methods
    3.3.1 Component and Script Generation
    3.3.2 Combinations
    3.3.3 Selection Strategy for Combinations
    3.3.4 Arena-style Evaluation
      3.3.4.1 Explanation and Examples
  3.4 Methodology Review
Chapter 4 Datasets and Experimental Setup
  4.1 Application Scenarios
  4.2 Dataset
    4.2.1 RAG Reference
    4.2.2 Ansys-RHSC-20
  4.3 Evaluation Metrics
  4.4 Machine Environment
  4.5 Experimental Environment
  4.6 Experimental Parameters
  4.7 Roadmap of Experiments
Chapter 5 Experiments and Analysis
  5.1 Experiment 1: Arena-style Evaluation of Component Effectiveness in Code Generation
    5.1.1 Arena Information
    5.1.2 Pairwise Voting
    5.1.3 Pairwise Voting - Even Sample
    5.1.4 Elo Rating
    5.1.5 Elo Rating - Even Sample
    5.1.6 Bradley-Terry Model
    5.1.7 Bradley-Terry Model - Even Sample
    5.1.8 Ablation Study
      5.1.8.1 Semantic Splitter
      5.1.8.2 Data Renovation
      5.1.8.3 Script Augmentation
      5.1.8.4 Implicit Knowledge Expansion and Contemplation (IKEC)
      5.1.8.5 Zero-shot Chain-of-Thought (ZCoT)
    5.1.9 Comprehensive Explanation
  5.2 Experiment 2: Comparison of RAG Data Source Proportions in Different Component Combinations
    5.2.1 Data Preprocessing
    5.2.2 Prompt Techniques
Chapter 6 Conclusions
  6.1 Summary of Findings
  6.2 Future Prospects
References
dc.language.iso: en
dc.subject [zh_TW]: 自然語言處理
dc.subject [zh_TW]: 大型語言模型
dc.subject [zh_TW]: 程式碼生成
dc.subject [zh_TW]: 資料嵌入
dc.subject [zh_TW]: 資料前處理
dc.subject [zh_TW]: 語義分割
dc.subject [zh_TW]: 資料翻新
dc.subject [zh_TW]: 腳本擴增
dc.subject [zh_TW]: 提示技術
dc.subject [en]: Data Renovation
dc.subject [en]: Natural Language Processing (NLP)
dc.subject [en]: Semantic Splitter
dc.subject [en]: Large Language Models (LLMs)
dc.subject [en]: Code Generation
dc.subject [en]: Data Embedding
dc.subject [en]: Data Preprocessing
dc.subject [en]: Prompt Techniques
dc.subject [en]: Script Augmentation
dc.title [zh_TW]: 基於大型語言模型的五項程式碼生成組件及其評估
dc.title [en]: Proposition and Evaluation of Five Constructive Components for Code Generation via Large Language Models
dc.type: Thesis
dc.date.schoolyear: 112-2
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee [zh_TW]: 李宏毅;張鴻嘉
dc.contributor.oralexamcommittee [en]: Hung-yi Lee;Norman Chang
dc.subject.keyword [zh_TW]: 自然語言處理,大型語言模型,程式碼生成,資料嵌入,資料前處理,語義分割,資料翻新,腳本擴增,提示技術
dc.subject.keyword [en]: Natural Language Processing (NLP),Large Language Models (LLMs),Code Generation,Data Embedding,Data Preprocessing,Semantic Splitter,Data Renovation,Script Augmentation,Prompt Techniques
dc.relation.page: 90
dc.identifier.doi: 10.6342/NTU202402417
dc.rights.note: 同意授權(全球公開) (Authorized; open access worldwide)
dc.date.accepted: 2024-08-09
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science)
dc.contributor.author-dept: 資訊工程學系 (Department of Computer Science and Information Engineering)
Appears in collections: 資訊工程學系 (Department of Computer Science and Information Engineering)

Files in this item:
File: ntu-112-2.pdf | Size: 3.89 MB | Format: Adobe PDF