NTU Theses and Dissertations Repository (DSpace)
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94700
Full metadata record
dc.contributor.advisor: 張智星 (zh_TW)
dc.contributor.advisor: Jyh-Shing Roger Jang (en)
dc.contributor.author: 林育辰 (zh_TW)
dc.contributor.author: Yu-Chen Lin (en)
dc.date.accessioned: 2024-08-16T17:36:06Z
dc.date.available: 2024-08-22
dc.date.copyright: 2024-08-16
dc.date.issued: 2024
dc.date.submitted: 2024-08-06
dc.identifier.citation:
[1] A. Balaguer, V. Benara, R. L. de Freitas Cunha, R. de M. Estevão Filho, T. Hendry, D. Holstein, et al. Rag vs fine-tuning: Pipelines, tradeoffs, and a case study on agriculture. arXiv preprint arXiv:2401.08406, 2024.
[2] R. A. Bradley and M. E. Terry. Rank analysis of incomplete block designs: I. the method of paired comparisons. Biometrika, 39(3/4):324–345, 1952.
[3] W.-L. Chiang, L. Zheng, Y. Sheng, A. N. Angelopoulos, T. Li, D. Li, et al. Chatbot arena: An open platform for evaluating llms by human preference. arXiv preprint arXiv:2403.04132, 2024.
[4] J. Dean and S. Ghemawat. Mapreduce: simplified data processing on large clusters. Communications of the ACM, 51(1):107–113, 2008.
[5] D. Dua, S. Gupta, S. Singh, and M. Gardner. Successive prompting for decomposing complex questions. arXiv preprint arXiv:2212.04092, 2022.
[6] A. Dubey, A. Jauhri, A. Pandey, A. Kadian, A. Al-Dahle, A. Letman, et al. The llama 3 herd of models. arXiv preprint arXiv:2407.21783, 2024.
[7] D. Edge, H. Trinh, N. Cheng, J. Bradley, A. Cha, A. Mody, et al. From local to global: A graph rag approach to query-focused summarization. arXiv preprint arXiv:2404.16130, 2024.
[8] B. Efron. Bootstrap methods: another look at the jackknife. In Breakthroughs in statistics: Methodology and distribution, pages 569–593. Springer, 1992.
[9] A. Eliseev and D. Mazur. Fast inference of mixture-of-experts language models with offloading. arXiv preprint arXiv:2312.17238, 2023.
[10] A. Elo. The Rating of Chessplayers: Past and Present. Ishi Press International, 2008.
[11] Y. Gao, Y. Xiong, X. Gao, K. Jia, J. Pan, Y. Bi, et al. Retrieval-augmented generation for large language models: A survey. arXiv preprint arXiv:2312.10997, 2024.
[12] M. E. Glickman and A. C. Jones. Rating the chess rating system. CHANCE-BERLIN THEN NEW YORK-, 12:21–28, 1999.
[13] A. Gu and T. Dao. Mamba: Linear-time sequence modeling with selective state spaces. arXiv preprint arXiv:2312.00752, 2023.
[14] Z. He, H. Wu, X. Zhang, X. Yao, S. Zheng, H. Zheng, et al. Chateda: A large language model powered autonomous agent for eda. In 2023 ACM/IEEE 5th Workshop on Machine Learning for CAD (MLCAD), pages 1–6. IEEE, 2023.
[15] C.-Y. Hsieh, C.-L. Li, C.-K. Yeh, H. Nakhost, Y. Fujii, A. Ratner, et al. Distilling step-by-step! outperforming larger language models with less training data and smaller model sizes. arXiv preprint arXiv:2305.02301, 2023.
[16] D. Huang, Q. Bu, J. M. Zhang, M. Luck, and H. Cui. Agentcoder: Multi-agent code generation with iterative testing and optimization. arXiv preprint arXiv:2312.13000, 2024.
[17] A. Q. Jiang, A. Sablayrolles, A. Roux, A. Mensch, B. Savary, C. Bamford, et al. Mixtral of experts. arXiv preprint arXiv:2401.04088, 2024.
[18] X. Jiang, Y. Dong, L. Wang, Z. Fang, Q. Shang, G. Li, et al. Self-planning code generation with large language models. arXiv preprint arXiv:2303.06689, 2023.
[19] Z. Jiang, F. F. Xu, L. Gao, Z. Sun, Q. Liu, J. Dwivedi-Yu, et al. Active retrieval augmented generation. arXiv preprint arXiv:2305.06983, 2023.
[20] T. Kojima, S. S. Gu, M. Reid, Y. Matsuo, and Y. Iwasawa. Large language models are zero-shot reasoners. arXiv preprint arXiv:2205.11916, 2023.
[21] P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal, et al. Retrieval-augmented generation for knowledge-intensive nlp tasks. arXiv preprint arXiv:2005.11401, 2021.
[22] C. Li, J. Liang, A. Zeng, X. Chen, K. Hausman, D. Sadigh, et al. Chain of code: Reasoning with a language model-augmented code emulator. arXiv preprint arXiv:2312.04474, 2023.
[23] C. Li, J. Wang, Y. Zhang, K. Zhu, W. Hou, J. Lian, et al. Large language models understand and can be enhanced by emotional stimuli. arXiv preprint arXiv:2307.11760, 2023.
[24] Y.-C. Lin, A. Kumar, N. Chang, W. Zhang, M. Zakir, R. Apte, et al. Novel pre-processing technique for data embedding in engineering code generation using large language model. In 1st IEEE International Workshop on LLM-Aided Design, 2024.
[25] Y.-C. Lin, A. Kumar, N. Chang, W. Zhang, M. Zakir, R. Apte, et al. Novel pre-processing technique for data embedding in engineering code generation using large language model. arXiv preprint arXiv:2311.16267, 2024.
[26] M. Liu, N. Pinckney, B. Khailany, and H. Ren. Verilogeval: Evaluating large language models for verilog code generation. arXiv preprint arXiv:2309.07544, 2023.
[27] S. Liu, W. Fang, Y. Lu, Q. Zhang, H. Zhang, and Z. Xie. Rtlcoder: Outperforming gpt-3.5 in design rtl generation with our open-source dataset and lightweight solution. In 1st IEEE International Workshop on LLM-Aided Design, 2024.
[28] S. Minaee, T. Mikolov, N. Nikzad, M. Chenaghlu, R. Socher, X. Amatriain, et al. Large language models: A survey. arXiv preprint arXiv:2402.06196, 2024.
[29] A. Mitra, L. D. Corro, S. Mahajan, A. Codas, C. Simoes, S. Agarwal, et al. Orca 2: Teaching small language models how to reason. arXiv preprint arXiv:2311.11045, 2023.
[30] E. Nijkamp, B. Pang, H. Hayashi, L. Tu, H. Wang, Y. Zhou, et al. Codegen: An open large language model for code with multi-turn program synthesis. arXiv preprint arXiv:2203.13474, 2023.
[31] OpenAI, J. Achiam, S. Adler, S. Agarwal, L. Ahmad, I. Akkaya, et al. Gpt-4 technical report. arXiv preprint arXiv:2303.08774, 2023.
[32] B. Peng, C. Li, P. He, M. Galley, and J. Gao. Instruction tuning with gpt-4. arXiv preprint arXiv:2304.03277, 2023.
[33] M. Schäfer, S. Nadi, A. Eghbali, and F. Tip. An empirical evaluation of using large language models for automated unit test generation. arXiv preprint arXiv:2302.06527, 2023.
[34] G. Team, M. Riviere, S. Pathak, P. G. Sessa, C. Hardin, S. Bhupatiraju, et al. Gemma 2: Improving open language models at a practical size. arXiv preprint arXiv:2408.00118, 2024.
[35] S. Thakur, B. Ahmad, H. Pearce, B. Tan, B. Dolan-Gavitt, R. Karri, et al. Verigen: A large language model for verilog code generation. arXiv preprint arXiv:2308.00708, 2023.
[36] H. Touvron, L. Martin, K. Stone, P. Albert, A. Almahairi, Y. Babaei, et al. Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288, 2023.
[37] Y.-D. Tsai, M. Liu, and H. Ren. Rtlfixer: Automatically fixing rtl syntax errors with large language models. In Proceedings of the 61st ACM/IEEE Design Automation Conference (DAC'24), 2024.
[38] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, et al. Attention is all you need. Advances in neural information processing systems, 30, 2017.
[39] G. Xiao, Y. Tian, B. Chen, S. Han, and M. Lewis. Efficient streaming language models with attention sinks. arXiv preprint arXiv:2309.17453, 2023.
[40] S. Yao, J. Zhao, D. Yu, N. Du, I. Shafran, K. Narasimhan, et al. React: Synergizing reasoning and acting in language models. arXiv preprint arXiv:2210.03629, 2023.
[41] W. X. Zhao, K. Zhou, J. Li, T. Tang, X. Wang, Y. Hou, et al. A survey of large language models. arXiv preprint arXiv:2303.18223, 2023.
[42] H. S. Zheng, S. Mishra, X. Chen, H.-T. Cheng, E. H. Chi, Q. V. Le, et al. Take a step back: Evoking reasoning via abstraction in large language models. arXiv preprint arXiv:2310.06117, 2023.
[43] L. Zheng, W.-L. Chiang, Y. Sheng, S. Zhuang, Z. Wu, Y. Zhuang, et al. Judging llm-as-a-judge with mt-bench and chatbot arena. arXiv preprint arXiv:2306.05685, 2023.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94700
dc.description.abstract (zh_TW): 本文提出五項基於大型語言模型的程式碼生成組件,用於特定領域的腳本生成,並評估其有效性。貢獻:(i) 基於大型語言模型的語義分割 (Semantic Splitter) 以及資料翻新 (Data Renovation) 組件,以改進資料語義的表示;(ii) 運用大型語言模型重構以產生高品質程式碼的組件 Script Augmentation;(iii) 提出提示技術隱形知識擴展與思考 (Implicit Knowledge Expansion and Contemplation, IKEC) 組件;(iv) 提出程式碼生成的流程,以五項組件漸進式生成工程模擬軟體 RedHawk-SC 的程式碼;(v) 評估不同參考資料型態之於程式碼生成的有效性。零樣本連鎖思維 (Zero-shot Chain-of-Thought, ZCoT) 為有效的提示技術,包括在五項建設性組件中,以利評估其餘組件之有效性。我們邀請 28 位領域專家透過競技場式評估蒐集 187 份成對比較結果以驗證前述組件之有效性,其中最佳組件於工程軟體 RedHawk-SC 上 MapReduce 程式碼生成表現達到 21.26% 的勝率提升,相較零樣本連鎖思維 6.68% 勝率提升顯著許多。
dc.description.abstract (en): We propose five constructive components based on Large Language Models (LLMs) for domain-specific code generation and evaluate their effectiveness. The contributions are (i) Semantic splitter and data renovation for improved data semantic representation; (ii) Script augmentation for enhanced code quality; (iii) Implicit Knowledge Expansion and Contemplation (IKEC) prompting technique; (iv) A workflow using hierarchical generation for scripts in the engineering software RedHawk-SC; (v) An evaluation of different reference data types for code generation. We invited 28 domain experts to conduct an arena-style evaluation, collecting 187 paired comparisons to validate the effectiveness of those components. The best component achieved a 21.26% win rate improvement in MapReduce code generation performance for RedHawk-SC, significantly outperforming the 6.68% win rate improvement of the Zero-shot Chain-of-Thought (ZCoT).
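The arena-style evaluation described in the abstract reduces pairwise expert votes to win rates and ratings (the table of contents also lists Elo and Bradley-Terry analyses). The sketch below is illustrative only, not the thesis code: the variant names and the tiny vote log are hypothetical, and it merely shows how a win rate and a standard sequential Elo update can be computed from (winner, loser) pairs.

```python
from collections import defaultdict

def elo_ratings(matches, k=32, base=1000.0):
    """Sequential Elo from an ordered list of (winner, loser) pairs.

    Expected score uses the standard logistic curve on a 400-point
    scale; each match moves both ratings by k * (actual - expected).
    """
    ratings = defaultdict(lambda: base)
    for winner, loser in matches:
        rw, rl = ratings[winner], ratings[loser]
        expected_w = 1.0 / (1.0 + 10 ** ((rl - rw) / 400.0))
        ratings[winner] = rw + k * (1.0 - expected_w)
        ratings[loser] = rl - k * (1.0 - expected_w)
    return dict(ratings)

def win_rates(matches):
    """Fraction of its comparisons that each variant wins."""
    wins, games = defaultdict(int), defaultdict(int)
    for winner, loser in matches:
        wins[winner] += 1
        games[winner] += 1
        games[loser] += 1
    return {p: wins[p] / games[p] for p in games}

# Hypothetical vote log: each tuple is (preferred variant, other variant).
votes = [("IKEC", "ZCoT"), ("IKEC", "baseline"), ("ZCoT", "baseline"),
         ("IKEC", "ZCoT"), ("baseline", "ZCoT")]
print(win_rates(votes))
print(elo_ratings(votes))
```

With equal k for both sides, the total rating is conserved across updates, so the ratings express only relative strength, as in Chatbot Arena-style leaderboards.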
dc.description.provenance (en): Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-08-16T17:36:06Z. No. of bitstreams: 0
dc.description.provenance (en): Made available in DSpace on 2024-08-16T17:36:06Z (GMT). No. of bitstreams: 0
dc.description.tableofcontents:
Verification Letter from the Oral Examination Committee (p. i)
Acknowledgements (p. iii)
摘要 (p. v)
Abstract (p. vii)
Contents (p. ix)
List of Figures (p. xv)
List of Tables (p. xix)
Chapter 1 Introduction (p. 1)
  1.1 Research Topic (p. 1)
  1.2 Motivation (p. 2)
  1.3 Challenges (p. 3)
  1.4 Contributions (p. 5)
  1.5 Methodology Overview (p. 6)
  1.6 Overview of Engineering Tools Applicable to the Code Generator (p. 8)
    1.6.1 RedHawk-SC (p. 8)
      1.6.1.1 RedHawk vs RedHawk-SC (p. 8)
      1.6.1.2 Technical Architecture (p. 9)
    1.6.2 MapReduce (p. 10)
  1.7 Chapter Overview (p. 11)
Chapter 2 Literature Review (p. 13)
  2.1 Large Language Models (LLMs) (p. 13)
    2.1.1 Various Large Language Models (p. 14)
    2.1.2 Retrieval-Augmented Generation (RAG) (p. 15)
    2.1.3 RAG vs Fine-tuning (p. 16)
    2.1.4 Enhancing Input Tokens (p. 17)
    2.1.5 Prompt Techniques and Mechanisms (p. 17)
    2.1.6 Distillation (p. 18)
  2.2 Related Work in Code Generation (p. 18)
  2.3 Evaluation (p. 20)
    2.3.1 Chatbot Arena (p. 20)
    2.3.2 Elo (p. 21)
    2.3.3 Bradley-Terry Model (p. 22)
    2.3.4 Bootstrap (p. 23)
Chapter 3 Methodology (p. 25)
  3.1 Problem Definition (p. 25)
    3.1.1 Technical Bottlenecks and Issues in Existing RAG Technologies (p. 26)
    3.1.2 Proposed Methods (p. 27)
  3.2 Method Description (p. 28)
    3.2.1 Semantic Splitter (p. 29)
    3.2.2 Data Renovation (p. 32)
    3.2.3 Script Augmentation (p. 33)
    3.2.4 Implicit Knowledge Expansion and Contemplation (IKEC) (p. 35)
    3.2.5 Zero-shot Chain-of-Thought (ZCoT) (p. 35)
    3.2.6 Code Generation Pipeline (p. 36)
      3.2.6.1 Task Planning (p. 36)
      3.2.6.2 Script Generation (p. 36)
  3.3 Experimental Methods (p. 40)
    3.3.1 Component and Script Generation (p. 40)
    3.3.2 Combinations (p. 41)
    3.3.3 Selection Strategy for Combinations (p. 43)
    3.3.4 Arena-style Evaluation (p. 44)
      3.3.4.1 Explanation and Examples (p. 45)
  3.4 Methodology Review (p. 46)
Chapter 4 Datasets and Experimental Setup (p. 47)
  4.1 Application Scenarios (p. 47)
  4.2 Dataset (p. 48)
    4.2.1 RAG Reference (p. 48)
    4.2.2 Ansys-RHSC-20 (p. 49)
  4.3 Evaluation Metrics (p. 50)
  4.4 Machine Environment (p. 50)
  4.5 Experimental Environment (p. 51)
  4.6 Experimental Parameters (p. 52)
  4.7 Roadmap of Experiments (p. 53)
Chapter 5 Experiments and Analysis (p. 55)
  5.1 Experiment 1: Arena-style Evaluation of Component Effectiveness in Code Generation (p. 55)
    5.1.1 Arena Information (p. 56)
    5.1.2 Pairwise Voting (p. 58)
    5.1.3 Pairwise Voting - Even Sample (p. 61)
    5.1.4 Elo Rating (p. 62)
    5.1.5 Elo Rating - Even Sample (p. 62)
    5.1.6 Bradley-Terry Model (p. 63)
    5.1.7 Bradley-Terry Model - Even Sample (p. 64)
    5.1.8 Ablation Study (p. 69)
      5.1.8.1 Semantic Splitter (p. 69)
      5.1.8.2 Data Renovation (p. 70)
      5.1.8.3 Script Augmentation (p. 71)
      5.1.8.4 Implicit Knowledge Expansion and Contemplation (IKEC) (p. 72)
      5.1.8.5 Zero-shot Chain-of-Thought (ZCoT) (p. 73)
    5.1.9 Comprehensive Explanation (p. 74)
  5.2 Experiment 2: Comparison of RAG Data Source Proportions in Different Component Combinations (p. 76)
    5.2.1 Data Preprocessing (p. 76)
    5.2.2 Prompt Techniques (p. 79)
Chapter 6 Conclusions (p. 81)
  6.1 Summary of Findings (p. 82)
  6.2 Future Prospects (p. 82)
References (p. 85)
dc.language.iso: en
dc.subject: 自然語言處理 (zh_TW)
dc.subject: 大型語言模型 (zh_TW)
dc.subject: 程式碼生成 (zh_TW)
dc.subject: 資料嵌入 (zh_TW)
dc.subject: 資料前處理 (zh_TW)
dc.subject: 語義分割 (zh_TW)
dc.subject: 資料翻新 (zh_TW)
dc.subject: 腳本擴增 (zh_TW)
dc.subject: 提示技術 (zh_TW)
dc.subject: Data Renovation (en)
dc.subject: Natural Language Processing (NLP) (en)
dc.subject: Semantic Splitter (en)
dc.subject: Large Language Models (LLMs) (en)
dc.subject: Code Generation (en)
dc.subject: Data Embedding (en)
dc.subject: Data Preprocessing (en)
dc.subject: Prompt Techniques (en)
dc.subject: Script Augmentation (en)
dc.title: 基於大型語言模型的五項程式碼生成組件及其評估 (zh_TW)
dc.title: Proposition and Evaluation of Five Constructive Components for Code Generation via Large Language Models (en)
dc.type: Thesis
dc.date.schoolyear: 112-2
dc.description.degree: 碩士
dc.contributor.oralexamcommittee: 李宏毅;張鴻嘉 (zh_TW)
dc.contributor.oralexamcommittee: Hung-yi Lee;Norman Chang (en)
dc.subject.keyword: 自然語言處理,大型語言模型,程式碼生成,資料嵌入,資料前處理,語義分割,資料翻新,腳本擴增,提示技術 (zh_TW)
dc.subject.keyword: Natural Language Processing (NLP),Large Language Models (LLMs),Code Generation,Data Embedding,Data Preprocessing,Semantic Splitter,Data Renovation,Script Augmentation,Prompt Techniques (en)
dc.relation.page: 90
dc.identifier.doi: 10.6342/NTU202402417
dc.rights.note: 同意授權(全球公開)
dc.date.accepted: 2024-08-09
dc.contributor.author-college: 電機資訊學院
dc.contributor.author-dept: 資訊工程學系
Appears in Collections: 資訊工程學系

Files in This Item:
File: ntu-112-2.pdf | Size: 3.89 MB | Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
