Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99252
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 周承復 | zh_TW
dc.contributor.advisor | Cheng-Fu Chou | en
dc.contributor.author | 趙亦天 | zh_TW
dc.contributor.author | I-Tien Chao | en
dc.date.accessioned | 2025-08-21T16:59:44Z | -
dc.date.available | 2025-08-22 | -
dc.date.copyright | 2025-08-21 | -
dc.date.issued | 2025 | -
dc.date.submitted | 2025-08-06 | -
dc.identifier.citation | [1] OpenAI. GPT-4 technical report, 2024.
[2] Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, et al. Llama: Open and efficient foundation language models, 2023.
[3] Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, et al. Llama 2: Open foundation and fine-tuned chat models, 2023.
[4] Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan, et al. The Llama 3 herd of models, 2024.
[5] Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, and Douwe Kiela. Retrieval-augmented generation for knowledge-intensive NLP tasks, 2021.
[6] Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, and Denny Zhou. Chain-of-thought prompting elicits reasoning in large language models, 2023.
[7] Renqian Luo, Liai Sun, Yingce Xia, Tao Qin, Sheng Zhang, Hoifung Poon, and Tie-Yan Liu. BioGPT: generative pre-trained transformer for biomedical text generation and mining. Briefings in Bioinformatics, 23(6), September 2022.
[8] Karan Singhal, Tao Tu, Juraj Gottweis, Rory Sayres, Ellery Wulczyn, Mohamed Amin, Le Hou, Kevin Clark, Stephen R Pfohl, Heather Cole-Lewis, et al. Towards expert-level medical question answering with large language models, 2023.
[9] Augustin Toma, Patrick R. Lawler, Jimmy Ba, Rahul G. Krishnan, Barry B. Rubin, and Bo Wang. Clinical Camel: An open expert-level medical language model with dialogue-based knowledge encoding, 2023.
[10] Thomas Savage, Ashwin Nayak, Robert Gallo, Ekanath Rangan, and Jonathan H Chen. Diagnostic reasoning prompts reveal the potential for large language model interpretability in medicine, 2023.
[11] Khaled Saab, Tao Tu, Wei-Hung Weng, Ryutaro Tanno, David Stutz, Ellery Wulczyn, Fan Zhang, Tim Strother, Chunjong Park, Elahe Vedadi, et al. Capabilities of Gemini models in medicine, 2024.
[12] Jiaqi Wang, Enze Shi, Sigang Yu, Zihao Wu, Chong Ma, Haixing Dai, Qiushi Yang, Yanqing Kang, Jinru Wu, Huawen Hu, et al. Prompt engineering for healthcare: Methodologies and applications, 2024.
[13] Jing Miao, Charat Thongprayoon, Supawadee Suppadungsuk, Oscar A Garcia Valencia, and Wisit Cheungpasitporn. Integrating retrieval-augmented generation with large language models in nephrology: advancing practical applications. Medicina, 60(3):445, 2024.
[14] Guangzhi Xiong, Qiao Jin, Zhiyong Lu, and Aidong Zhang. Benchmarking retrieval-augmented generation for medicine, 2024.
[15] Junde Wu, Jiayuan Zhu, Yunli Qi, Jingkun Chen, Min Xu, Filippo Menolascina, and Vicente Grau. Medical graph RAG: Towards safe medical large language model via graph retrieval-augmented generation, 2024.
[16] Darren Edge, Ha Trinh, Newman Cheng, Joshua Bradley, Alex Chao, Apurva Mody, Steven Truitt, Dasha Metropolitansky, Robert Osazuwa Ness, and Jonathan Larson. From local to global: A graph RAG approach to query-focused summarization, 2025.
[17] Shahul Es, Jithin James, Luis Espinosa-Anke, and Steven Schockaert. Ragas: Automated evaluation of retrieval augmented generation, 2025. | -
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99252 | -
dc.description.abstract | 大型語言模型(Large Language Model)因擁有龐大的參數量與大規模的預訓練,在一般性問答與推理任務中展現出卓越的表現。於高度專業化領域中,大型語言模型生成內容的可靠性顯得尤為關鍵。此類任務對模型推理的精確性有高度要求,且模型必須避免對事實內容進行扭曲、改寫或捏造。
本研究提出一種全新的混合式檢索增強生成(Retrieval-Augmented Generation, RAG)框架,專為醫學文獻問答系統而設計。我們的系統結合了同時對知識圖譜資料庫與向量資料庫進行搜尋,實現結構化與非結構化知識的統一檢索。在離線階段,我們針對醫學領域構建專屬的醫學知識圖譜資料庫,將非結構化文本做去雜訊處理,並應用專門設計的醫學實體與關係抽取流程,系統性地提取必要內容。在線上問答階段,我們引入「圖索引機制」(Graph-Based Indexing),將使用者查詢重新轉化為符合我們設計之知識圖結構的形式,提升檢索的精確度。於檢索階段,我們提出「圖引導式檢索」(Graph-Guided Retrieval)方法,首先查詢知識圖譜以獲得可信且結構化的資訊,並以此作為引導,進行向量資料庫的語義檢索,確保檢索結果兼具語境完整性與事實依據。在最終的答案生成階段,我們應用「圖增強生成」(Graph-Enhanced Generation)技術,對檢索結果進行重排序,優先採用最具相關性與可信度的資訊進行回答綜整。透過這種混合式的檢索與生成策略,我們不僅保留了結構化醫學知識的完整性,更優化了語義相關性,使得生成的回答在準確性、相關性及可解釋性上皆有顯著提升。
我們以中文耳鼻喉科醫學語料庫進行實證評估,並採用 RAGAs(Retrieval-Augmented Generation Assessment)自動化評估工具進行量化分析,包括真實性 (Faithfulness)、答案相關性 (Answer relevance)、內容精確性 (Context precision) 與內容召回率 (Context recall) 等多項指標。為驗證系統於真實情境中的應用價值,我們亦邀請國立臺灣大學醫學院附設醫院耳鼻喉科臨床醫師參與使用者研究,針對多套問答系統從事跨維度評估。實驗結果顯示,本研究所提出之架構於長篇問答生成與專業醫學推理任務中皆顯著優於現有方法,有效驗證其於醫學知識問答領域中之實用性與可靠性。 | zh_TW
dc.description.abstract | Large Language Models (LLMs) have demonstrated remarkable capabilities in general-purpose question answering and reasoning, owing to their extensive parameterization and large-scale pretraining. However, in specialized domains such as science and medicine, reliability becomes paramount: models must reason accurately over domain-specific knowledge while avoiding distortion, misrepresentation, or fabrication of facts.
In this work, we propose a novel hybrid Retrieval-Augmented Generation (RAG) framework specifically designed for medical literature question answering. Our approach unifies a domain-specialized knowledge graph and a vector database, enabling the retrieval of both structured and unstructured knowledge in a single, coherent pipeline. In the offline phase, we construct a medical knowledge graph by denoising unstructured texts and applying a tailored medical entity and relationship extraction pipeline. In the online phase, we introduce a graph-based indexing mechanism that reformulates user queries to align with the knowledge graph schema, thereby improving retrieval precision. Our graph-guided retrieval strategy first queries the knowledge graph to obtain reliable, structured evidence, which then guides semantic search over the vector database, ensuring retrieval that is both contextually comprehensive and factually grounded. Finally, during answer synthesis, we apply graph-enhanced generation, re-ranking retrieved results to prioritize the most relevant and trustworthy information. This hybrid retrieval-generation paradigm preserves the structural integrity of medical knowledge while enhancing semantic relevance, yielding responses that are accurate, relevant, and interpretable.
We evaluate our framework on a curated Mandarin otolaryngology corpus from the National Taiwan University Hospital, using the RAGAs (Retrieval-Augmented Generation Assessment) benchmark to measure faithfulness, answer relevance, context precision, and context recall. To assess real-world applicability, we further conduct a user study with clinicians from the Department of Otolaryngology at NTUH, who rated multiple QA systems across key quality dimensions. Results consistently show that our approach outperforms both standard RAG and GraphRAG baselines, achieving significant improvements in factual accuracy, semantic relevance, and contextual grounding, validating its effectiveness for reliable, domain-specific medical question answering. | en
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-08-21T16:59:44Z. No. of bitstreams: 0 | en
dc.description.provenance | Made available in DSpace on 2025-08-21T16:59:44Z (GMT). No. of bitstreams: 0 | en
dc.description.tableofcontents | Verification Letter from the Oral Examination Committee
Acknowledgements
Chinese Abstract (摘要)
Abstract
Contents
List of Figures
List of Tables
Denotation
Chapter 1 Introduction
Chapter 2 Preliminaries
2.1 Chain-of-Thought Prompting
Chapter 3 Related Work
3.1 LLM for Medicine
3.2 Retrieval-Augmented Generation
Chapter 4 Methodology
4.1 System Overview
4.2 Offline - Database Construction
4.2.1 Graph-Based Database
4.2.1.1 Text Preprocessing - Semantic Chunking
4.2.1.2 Text Preprocessing - Denoising and Normalization
4.2.1.3 Entity Detection and Relationship Linking
4.2.1.4 Cypher Query Generation and Community Detection
4.2.2 Vector-Based Database
4.3 Online - Query Time Workflow
4.3.1 Graph-Based Indexing
4.3.2 Graph-Guided Retrieval
4.3.3 Graph-Enhanced Generation
Chapter 5 Experiments
5.1 Experimental Settings
5.1.1 Data Source
5.1.2 Vector Database Construction
5.2 Evaluation with RAGAs
5.3 User Study
Chapter 6 Conclusion
References
Appendix A - Database construction setting hyperparameters
A.1 Knowledge graph construction settings
A.2 Vector database construction settings
Appendix B - Entity detection prompt and Relationship linking prompt
B.1 Entity Detection System Prompt
B.2 Relationship Linking System Prompt | -
dc.language.iso | en | -
dc.subject | 檢索增強生成 | zh_TW
dc.subject | 大型語言模型 | zh_TW
dc.subject | 知識圖譜 | zh_TW
dc.subject | Knowledge Graph | en
dc.subject | Large Language Model | en
dc.subject | Retrieval-Augmented Generation | en
dc.title | 結合圖譜推理與語意搜尋以實現知識本位的醫學回應生成 | zh_TW
dc.title | Integrating Graph Reasoning and Semantic Search for Knowledge-Grounded Medical Response Generation | en
dc.type | Thesis | -
dc.date.schoolyear | 113-2 | -
dc.description.degree | Master's (碩士) | -
dc.contributor.oralexamcommittee | 吳曉光;呂政修;蔡瑞煌;吳振吉 | zh_TW
dc.contributor.oralexamcommittee | Hsiao-kuang Wu;Jenq-Shiou Leu;Rua-Huan Tsaih;Chen-Chi Wu | en
dc.subject.keyword | 大型語言模型,檢索增強生成,知識圖譜 | zh_TW
dc.subject.keyword | Large Language Model, Retrieval-Augmented Generation, Knowledge Graph | en
dc.relation.page | 37 | -
dc.identifier.doi | 10.6342/NTU202502664 | -
dc.rights.note | Authorization granted, open access worldwide (同意授權(全球公開)) | -
dc.date.accepted | 2025-08-09 | -
dc.contributor.author-college | College of Electrical Engineering and Computer Science (電機資訊學院) | -
dc.contributor.author-dept | Graduate Institute of Networking and Multimedia (資訊網路與多媒體研究所) | -
dc.date.embargo-lift | 2025-08-22 | -
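
Illustrative sketch (not part of the repository record): the English abstract describes graph-guided retrieval, where the knowledge graph is queried first and its results guide semantic search over the vector database, followed by re-ranking. The Python below is a minimal sketch of that idea under loud assumptions: the triples, the passages, and every function name are hypothetical; a bag-of-words cosine similarity stands in for a real embedding search; and the examples are in English although the thesis corpus is Mandarin. It is not the thesis implementation.

```python
import math
import re
from collections import Counter

# Hypothetical toy data, not taken from the thesis.
TRIPLES = [
    ("otitis media", "treated_by", "amoxicillin"),
    ("otitis media", "has_symptom", "ear pain"),
    ("sinusitis", "has_symptom", "nasal congestion"),
]
PASSAGES = [
    "Amoxicillin is a common first choice antibiotic for acute ear infection.",
    "Nasal congestion and facial pressure are typical of sinusitis.",
    "Ear pain and fever in children often indicate otitis media.",
]


def tokens(text):
    """Lowercase word tokens; a crude stand-in for an embedding model."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Bag-of-words cosine similarity between two strings."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def graph_lookup(query):
    """Step 1: return triples whose head or tail entity occurs in the query."""
    q = query.lower()
    return [t for t in TRIPLES if t[0] in q or t[2] in q]


def vector_search(text, k):
    """Step 2: rank passages by similarity to the (graph-expanded) query."""
    return sorted(PASSAGES, key=lambda p: cosine(text, p), reverse=True)[:k]


def rerank(candidates, entities):
    """Step 3: prefer passages that mention more graph entities (stable sort)."""
    def mentions(passage):
        return sum(e in passage.lower() for e in entities)
    return sorted(candidates, key=mentions, reverse=True)


def answer_context(query, k=2):
    """Graph facts guide the vector search; re-ranked passages are returned."""
    facts = graph_lookup(query)
    entities = sorted({e for head, _, tail in facts for e in (head, tail)})
    expanded = " ".join([query, *entities])
    candidates = vector_search(expanded, k=len(PASSAGES))
    return facts, rerank(candidates, entities)[:k]
```

Calling `answer_context("How is otitis media treated?")` returns the two otitis media triples together with the two passages that overlap most with the graph-expanded query. When no entity matches, the function falls back to plain similarity search over the original query.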
Appears in Collections: Graduate Institute of Networking and Multimedia (資訊網路與多媒體研究所)

Files in This Item:
File | Size | Format
ntu-113-2.pdf | 1.29 MB | Adobe PDF
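Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.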
