Integrating Graph Reasoning and Semantic Search for Knowledge-Grounded Medical Response Generation

趙亦天; I-Tien Chao

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99252

Title:	Integrating Graph Reasoning and Semantic Search for Knowledge-Grounded Medical Response Generation (結合圖譜推理與語意搜尋以實現知識本位的醫學回應生成)
Author:	趙亦天 I-Tien Chao
Advisor:	周承復 Cheng-Fu Chou
Keywords:	Large Language Model, Retrieval-Augmented Generation, Knowledge Graph
Year of publication:	2025
Degree:	Master's
Abstract:	Large language models (LLMs), with their vast parameter counts and large-scale pretraining, perform remarkably well on general question answering and reasoning tasks. In highly specialized domains, the reliability of LLM-generated content becomes especially critical: such tasks demand precise reasoning, and the model must not distort, rewrite, or fabricate facts. This study proposes a novel hybrid Retrieval-Augmented Generation (RAG) framework designed for medical literature question answering. The system searches a knowledge graph database and a vector database together, unifying the retrieval of structured and unstructured knowledge. In the offline phase, we build a dedicated medical knowledge graph database: unstructured text is denoised, and a purpose-built medical entity and relation extraction pipeline systematically extracts the essential content. In the online question-answering phase, we introduce a graph-based indexing mechanism that reformulates the user query into a form that matches our knowledge graph schema, improving retrieval precision. In the retrieval phase, we propose graph-guided retrieval, which first queries the knowledge graph for trustworthy, structured information and then uses it to guide semantic retrieval over the vector database, so that the retrieved results are both contextually complete and factually grounded. In the final answer generation phase, we apply graph-enhanced generation, which reranks the retrieved results and gives priority to the most relevant and credible information when synthesizing the answer. With this hybrid retrieval and generation strategy, we preserve the integrity of structured medical knowledge and also improve semantic relevance, so that the generated answers improve markedly in accuracy, relevance, and interpretability. We evaluate the framework empirically on a Chinese otolaryngology corpus and use the RAGAs (Retrieval Augmented Generation Assessment) automated evaluation tool for quantitative analysis, covering faithfulness, answer relevance, context precision, and context recall. To verify the system's value in real settings, we also invited otolaryngology clinicians at National Taiwan University Hospital to take part in a user study that rated several question-answering systems across multiple dimensions. The experimental results show that the proposed architecture significantly outperforms existing methods on both long-form answer generation and specialized medical reasoning tasks, validating its practicality and reliability for medical knowledge question answering.
Large Language Models (LLMs) have demonstrated remarkable capabilities in general-purpose question answering and reasoning, owing to their extensive parameterization and large-scale pretraining. However, in specialized domains such as science and medicine, reliability becomes paramount: models must reason accurately over domain-specific knowledge while avoiding distortion, misrepresentation, or fabrication of facts. In this work, we propose a novel hybrid Retrieval-Augmented Generation (RAG) framework specifically designed for medical literature question answering. Our approach unifies a domain-specialized knowledge graph and a vector database, enabling the retrieval of both structured and unstructured knowledge in a single, coherent pipeline. In the offline phase, we construct a medical knowledge graph by denoising unstructured texts and applying a tailored medical entity and relationship extraction pipeline. In the online phase, we introduce a graph-based indexing mechanism that reformulates user queries to align with the knowledge graph schema, thereby improving retrieval precision. Our graph-guided retrieval strategy first queries the knowledge graph to obtain reliable, structured evidence, which then guides semantic search over the vector database, ensuring retrieval that is both contextually comprehensive and factually grounded. Finally, during answer synthesis, we apply graph-enhanced generation, reranking retrieved results to prioritize the most relevant and trustworthy information. This hybrid retrieval-generation paradigm preserves the structural integrity of medical knowledge while enhancing semantic relevance, yielding responses that are accurate, relevant, and interpretable. We evaluate our framework on a curated Mandarin otolaryngology corpus from the National Taiwan University Hospital, using the RAGAs (Retrieval-Augmented Generation Assessment) benchmark to measure faithfulness, answer relevance, context precision, and context recall. To assess real-world applicability, we further conduct a user study with clinicians from the Department of Otolaryngology at NTUH, who rated multiple QA systems across key quality dimensions. Results consistently show that our approach outperforms both standard RAG and GraphRAG baselines, achieving significant improvements in factual accuracy, semantic relevance, and contextual grounding, validating its effectiveness for reliable, domain-specific medical question answering.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99252
DOI:	10.6342/NTU202502664
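Full-text authorization:	Authorized (open access worldwide)
Full-text release date:	2025-08-22
Appears in collections:	Graduate Institute of Networking and Multimedia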
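
As a rough illustration of the graph-guided retrieval and reranking steps described in the abstract, the Python sketch below first looks up knowledge graph triples for a query, then uses them to expand the query before a similarity search over text chunks, and finally reorders the hits by how many graph entities they mention. It is not the thesis implementation: the triples, the text chunks, the bag-of-words `embed` function (standing in for a real embedding model), and every function name here are hypothetical and chosen only to make the idea concrete.

```python
import math
import re
from collections import Counter

# Hypothetical toy knowledge graph: (head, relation, tail) triples.
TRIPLES = [
    ("otitis media", "has_symptom", "ear pain"),
    ("otitis media", "treated_with", "amoxicillin"),
    ("sinusitis", "has_symptom", "nasal congestion"),
]

# Hypothetical text chunks standing in for the vector database contents.
CHUNKS = [
    "Amoxicillin is a common first-line antibiotic for acute otitis media.",
    "Nasal congestion and facial pressure are typical of sinusitis.",
    "Ear pain in children often accompanies otitis media.",
    "Hearing tests measure thresholds across frequencies.",
]


def embed(text):
    """Bag-of-words vector; an assumed stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def graph_lookup(query):
    """Return triples whose head entity is mentioned in the query."""
    q = query.lower()
    return [t for t in TRIPLES if t[0] in q]


def graph_guided_retrieve(query, k=2):
    """Query the graph first, then let its evidence guide the chunk search."""
    triples = graph_lookup(query)
    evidence = " ".join(f"{h} {t}" for h, _, t in triples)
    q_vec = embed(f"{query} {evidence}")
    ranked = sorted(CHUNKS, key=lambda c: cosine(q_vec, embed(c)), reverse=True)
    return triples, ranked[:k]


def rerank(triples, chunks):
    """Prefer chunks that mention more of the retrieved graph entities."""
    entities = {e for h, _, t in triples for e in (h, t)}
    return sorted(
        chunks,
        key=lambda c: sum(e in c.lower() for e in entities),
        reverse=True,
    )


if __name__ == "__main__":
    triples, chunks = graph_guided_retrieve("How is otitis media treated?")
    for chunk in rerank(triples, chunks):
        print(chunk)
```

In this toy setup the graph evidence pulls the search toward chunks about otitis media and its treatment; a real system would replace the word-count vectors with learned embeddings and the in-memory lists with graph and vector stores.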

Files in this item:

File	Size	Format
ntu-113-2.pdf	1.29 MB	Adobe PDF

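Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.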
