Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99661
Full metadata record
(DC field: value [language])
dc.contributor.advisor: 傅立成 [zh_TW]
dc.contributor.advisor: Li-Chen Fu [en]
dc.contributor.author: 李昱辰 [zh_TW]
dc.contributor.author: Yu-Chen Li [en]
dc.date.accessioned: 2025-09-17T16:17:54Z
dc.date.available: 2025-09-18
dc.date.copyright: 2025-09-17
dc.date.issued: 2025
dc.date.submitted: 2025-08-06
dc.identifier.citation[1] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in neural information processing systems, 30, 2017.
[2] Jingfeng Yang, Hongye Jin, Ruixiang Tang, Xiaotian Han, Qizhang Feng, Haoming Jiang, Shaochen Zhong, Bing Yin, and Xia Hu. Harnessing the power of llms in practice: A survey on chatgpt and beyond. ACM Transactions on Knowledge Discovery from Data, 18(6):1–32, 2024.
[3] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 1 (long and short papers), pages 4171–4186, 2019.
[4] Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners. Advances in neural information processing systems, 33:1877–1901, 2020.
[5] Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, et al. Palm: Scaling language modeling with pathways. Journal of Machine Learning Research, 24(240):1–113, 2023.
[6] Karan Singhal, Shekoofeh Azizi, Tao Tu, S Sara Mahdavi, Jason Wei, Hyung Won Chung, Nathan Scales, Ajay Tanwani, Heather Cole-Lewis, Stephen Pfohl, et al. Large language models encode clinical knowledge. Nature, 620(7972):172–180, 2023.
[7] Karan Singhal, Tao Tu, Juraj Gottweis, Rory Sayres, Ellery Wulczyn, Mohamed Amin, Le Hou, Kevin Clark, Stephen R Pfohl, Heather Cole-Lewis, et al. Toward expert-level medical question answering with large language models. Nature Medicine, pages 1–8, 2025.
[8] Khaled Saab, Tao Tu, Wei-Hung Weng, Ryutaro Tanno, David Stutz, Ellery Wulczyn, Fan Zhang, Tim Strother, Chunjong Park, Elahe Vedadi, et al. Capabilities of gemini models in medicine. arXiv preprint arXiv:2404.18416, 2024.
[9] Chaoyi Wu, Weixiong Lin, Xiaoman Zhang, Ya Zhang, Weidi Xie, and Yanfeng Wang. Pmc-llama: toward building open-source language models for medicine. Journal of the American Medical Informatics Association, 31(9):1833–1843, 2024.
[10] Cyril Zakka, Rohan Shad, Akash Chaurasia, Alex R Dalal, Jennifer L Kim, Michael Moor, Robyn Fong, Curran Phillips, Kevin Alexander, Euan Ashley, et al. Al-manac--retrieval-augmented language models for clinical medicine. Nejm ai, 1(2):Aioa2300068, 2024.
[11] Polat Goktas and Andrzej Grzybowski. Shaping the future of healthcare: Ethical clinical challenges and pathways to trustworthy ai. Journal of Clinical Medicine, 14(5):1605, 2025.
[12] Ariel Soares Teles, Alaa Abd-Alrazaq, Thomas F Heston, Rafat Damseh, and Livia Ruback. Large language models for medical applications, 2025.
[13] Rashed Isam Ashqar, Anas Ashqar, Mohammed Elehnawy, and Huthaifa I Ashqar. Using large language models for medical diagnosis: A survey and future trends. Authorea Preprints, 2025.
[14] Qiao Jin, Bhuwan Dhingra, Zhengping Liu, William W Cohen, and Xinghua Lu. Pubmedqa: A dataset for biomedical research question answering. arXiv preprint arXiv:1909.06146, 2019.
[15] Ankit Pal, Logesh Kumar Umapathi, and Malaikannan Sankarasubbu. Medmcqa: A large-scale multi-subject multi-choice dataset for medical domain question answering. In Conference on health, inference, and learning, pages 248–260. PMLR, 2022.
[16] Lucia Zheng, Neel Guha, Brandon R Anderson, Peter Henderson, and Daniel E Ho. When does pretraining help? assessing self-supervised learning for law and the case-hold dataset of 53,000+ legal holdings. In Proceedings of the eighteenth international conference on artificial intelligence and law, pages 159–168, 2021.
[17] Asma Ben Abacha and Dina Demner-Fushman. A question-entailment approach to question answering. BMC Bioinform., 20(1):511:1–511:23, 2019.
[18] John R.Ball, Bryan T Miller, and Erin P Balogh. Improving diagnosis in health care. 2015.
[19] Bowen Dong, Minheng Ni, Zitong Huang, Guanglei Yang, Wangmeng Zuo, and Lei Zhang. Mirage: Assessing hallucination in multimodal reasoning chains of mllm. arXiv preprint arXiv:2505.24238, 2025.
[20] Hadas Orgad, Michael Toker, Zorik Gekhman, Roi Reichart, Idan Szpektor, Hadas Kotek, and Yonatan Belinkov. Llms know more than they show: On the intrinsic representation of llm hallucinations. arXiv preprint arXiv:2410.02707, 2024.
[21] Stella Li, Vidhisha Balachandran, Shangbin Feng, Jonathan Ilgen, Emma Pierson, Pang Wei W Koh, and Yulia Tsvetkov. Mediq: Question-asking ilms and a benchmark for reliable interactive clinical reasoning. Advances in Neural Information Processing Systems, 37:28858–28888, 2024.
[22] Juncheng Wu, Wenlong Deng, Xingxuan Li, Sheng Liu, Taomian Mi, Yifan Peng, Ziyang Xu, Yi Liu, Hyunjin Cho, Chang-In Choi, et al. Medreason: Eliciting factual medical reasoning steps in llms via knowledge graphs. arXiv preprint arXiv:2504.00993, 2025.
[23] Harsha Nori, Yin Tat Lee, Sheng Zhang, Dean Carignan, Richard Edgar, Nicolo Fusi, Nicholas King, Jonathan Larson, Yuanzhi Li, Weishung Liu, et al. Can generalist foundation models outcompete special-purpose tuning? case study in medicine. arXiv preprint arXiv:2311.16452, 2023.
[24] Junde Wu, Jiayuan Zhu, Yunli Qi, Jingkun Chen, Min Xu, Filippo Menolascina, and Vicente Grau. Medical graph rag: Towards safe medical large language model via graph retrieval-augmented generation. arXiv preprint arXiv:2408.04187, 2024.
[25] Kaishuai Xu, Yi Cheng, Wenjun Hou, Qiaoyu Tan, and Wenjie Li. Reasoning like a doctor: Improving medical dialogue systems via diagnostic reasoning process alignment. arXiv preprint arXiv:2406.13934, 2024.
[26] Kaiwen Zuo, Yirui Jiang, Fan Mo, and Pietro Lio. Kg4diagnosis: A hierarchical multi-agent llm framework with knowledge graph enhancement for medical diagnosis. In AAAI Bridge Program on AI for Medicine and Healthcare, pages 195–204. PMLR, 2025.
[27] Yanjun Gao, Ruizhe Li, Emma Croxford, John Caskey, Brian W Patterson, Matthew Churpek, Timothy Miller, Dmitriy Dligach, and Majid Afshar. Leveraging medical knowledge graphs into large language models for diagnosis prediction: Design and application study. Jmir Ai, 4:e58670, 2025.
[28] Daniel Hausmann, Cristina Zulian, Edouard Battegay, and Lukas Zimmerli. Tracing the decision-making process of physicians with a decision process matrix. BMC medical informatics and decision making, 16:1–11, 2016.
[29] Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J Liu. Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of machine learning research, 21(140):1–67, 2020.
[30] Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov, and Luke Zettlemoyer. Bart: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv preprint arXiv:1910.13461, 2019.
[31] Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed Chi, Quoc V Le, Denny Zhou, et al. Chain-of-thought prompting elicits reasoning in large language models. Advances in neural information processing systems, 35:24824–24837, 2022.
[32] Thomas N Kipf and Max Welling. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907, 2016.
[33] Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. How powerful are graph neural networks? arXiv preprint arXiv:1810.00826, 2018.
[34] Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. Graph attention networks. arXiv preprint arXiv:1710.10903, 2017.
[35] Olivier Bodenreider. The unified medical language system (umls): integrating biomedical terminology. Nucleic acids research, 32(suppl_1):D267–D270, 2004.
[36] Nadine J McCleary, Ellana K Haakenstad, Jessica LF Cleveland, Michael Manni, Michael J Hassett, and Deb Schrag. Framework for integrating electronic patient-reported data in routine cancer care: an oncology intake questionnaire. JAMIA open, 5(3):ooac064, 2022.
[37] Mark Neumann, Daniel King, Iz Beltagy, and Waleed Ammar. Scispacy: fast and robust models for biomedical natural language processing. arXiv preprint arXiv:1902.07669, 2019.
[38] Fangyu Liu, Ehsan Shareghi, Zaiqiao Meng, Marco Basaldella, and Nigel Collier. Self-alignment pretraining for biomedical entity representations. arXiv preprint arXiv:2010.11784, 2020.
[39] Kai Wang, Weizhou Shen, Yunyi Yang, Xiaojun Quan, and Rui Wang. Relational graph attention network for aspect-based sentiment analysis. arXiv preprint arXiv:2004.12382, 2020.
[40] Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollár. Focal loss for dense object detection. In Proceedings of the IEEE international conference on computer vision, pages 2980–2988, 2017.
[41] Yan Chen, Shuai Sun, and Xiaochun Hu. Drkg: Faithful and interpretable multi-hop knowledge graph question answering via llm-guided reasoning plans. Applied Sciences, 15(12):6722, 2025.
-
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99661
dc.description.abstract [zh_TW]:
大型語言模型(LLM)在醫療推理任務上已展現巨大潛力,但現有的研究大多是在資訊完整的理想化情境下進行,這與臨床實踐中資訊不完整的現實有所脫節。為解決此問題,本研究提出了一個新穎的互動式推理框架,旨在賦予LLM主動提問以收集關鍵資訊的能力。我們的核心方法是設計一個知識圖譜推理器(KG Reasoner),它能在專業的醫療知識圖譜中探索,以識別當前最重要的資訊缺口,從而為LLM的提問提供策略性指引。此外,我們提出一個受臨床鑑別診斷思維啟發的信心評估機制,讓系統能夠準確評估自身的不確定性,並在必要時觸發提問,而非做出草率的決策。
為了驗證我們方法的有效性,我們在一個互動式醫學問答基準上進行了實驗。實驗結果表明,相較於如MediQ等現有的基準模型,我們的框架能透過更具策略性的提問,有效收集關鍵的病患資訊,並避免了因過早診斷關閉而導致的錯誤。在最終的問答準確率上,我們的模型取得了顯著的提升,證明了本研究所提出的KG引導的提問策略,是讓AI模型更貼近真實臨床實踐、提升其可靠性與安全性的可行途徑。
dc.description.abstract [en]:
While Large Language Models (LLMs) have demonstrated significant potential in medical reasoning, existing research is often conducted under the idealized assumption of complete information, which contrasts with the reality of incomplete information in clinical practice. To address this gap, this thesis proposes a novel interactive reasoning framework designed to empower LLMs to proactively ask questions that gather critical information. Our core methodology involves a Knowledge Graph (KG) Reasoner that explores a professional medical KG to identify the most crucial information gaps, thereby providing strategic guidance for the LLM's questioning. Furthermore, we introduce a confidence estimation mechanism inspired by the clinical process of differential diagnosis, enabling the system to accurately assess its own uncertainty and trigger questions when necessary, rather than making premature decisions.
To validate our approach, we conducted experiments on an interactive medical question-answering benchmark. The results demonstrate that, compared to existing baselines such as MediQ, our framework effectively gathers key patient information through more strategic questioning and avoids errors caused by premature diagnostic closure. Our model achieves a significant improvement in final question-answering accuracy, demonstrating that the proposed KG-guided questioning strategy is a viable path toward making AI models more aligned with real-world clinical workflows, thereby enhancing their reliability and safety.
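The interaction pattern the abstract describes (estimate confidence over a differential diagnosis; if it is too low, ask about the finding that best closes the information gap; otherwise commit to an answer) can be sketched as follows. This is a minimal illustrative sketch, not the thesis's implementation: the toy knowledge graph, the entropy-based confidence score, the threshold value, and every function name here are assumptions introduced for illustration.

```python
import math

def confidence(differential):
    """Confidence in [0, 1]: 1 minus the normalized entropy of the differential.
    A flat distribution over hypotheses gives low confidence; a peaked one, high."""
    probs = [p for p in differential.values() if p > 0]
    if len(probs) <= 1:
        return 1.0
    entropy = -sum(p * math.log(p) for p in probs)
    return 1.0 - entropy / math.log(len(probs))

def next_question(differential, kg):
    """Pick an unasked finding that discriminates the top two hypotheses:
    a finding linked (in the toy KG) to exactly one of them."""
    top_two = sorted(differential, key=differential.get, reverse=True)[:2]
    a, b = (set(kg[d]) for d in top_two)
    discriminating = a ^ b  # symmetric difference = findings unique to one hypothesis
    return min(discriminating) if discriminating else None

# Toy medical KG (hypothetical): diagnosis -> associated findings.
KG = {
    "influenza": {"fever", "myalgia", "cough"},
    "common cold": {"cough", "rhinorrhea"},
}

differential = {"influenza": 0.55, "common cold": 0.45}
THRESHOLD = 0.8  # assumed confidence gate, not a value from the thesis

if confidence(differential) < THRESHOLD:
    action = ("ask", next_question(differential, KG))
else:
    action = ("answer", max(differential, key=differential.get))
```

With the near-flat differential above, confidence is low, so the loop asks about "fever", the finding that separates the two hypotheses, rather than committing to a premature diagnosis.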
dc.description.provenance [en]: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-09-17T16:17:54Z. No. of bitstreams: 0
dc.description.provenance [en]: Made available in DSpace on 2025-09-17T16:17:54Z (GMT). No. of bitstreams: 0
dc.description.tableofcontents:
Verification Letter from the Oral Examination Committee i
Acknowledgements ii
摘要 iii
Abstract iv
Contents vi
List of Figures ix
List of Tables xi
Chapter 1 Introduction 1
1.1 Background 1
1.2 Motivation 2
1.3 Related Work 3
1.3.1 LLMs for Medical Question Answering 3
1.3.2 Interactive Approaches for Medical Reasoning 4
1.3.3 Knowledge Graph Integration in Medical AI 5
1.3.4 Comparison with Our Approach 5
1.4 Contribution 6
1.5 Thesis Organization 8
Chapter 2 Preliminaries 9
2.1 Large Language Models 9
2.1.1 Transformer Architecture 10
2.1.2 Architectural Variants of Language Models 11
2.1.3 Generative Pre-trained Transformer (GPT) Models 13
2.1.4 Prompt Engineering 14
2.2 Graph Neural Networks 15
2.2.1 Message Passing 15
2.2.2 Variants of Graph Neural Networks 16
2.3 Knowledge Graphs 17
2.3.1 Defining Knowledge Graphs 17
2.3.2 Medical Knowledge Graph 18
Chapter 3 Methodology 19
3.1 System Overview 19
3.2 Problem Formulation 20
3.3 Data Preprocessing and Label Generation 22
3.3.1 Label Path Construction 22
3.3.2 Semantic Filtering 23
3.3.3 BM25-based Information Value Filtering 24
3.4 KG Reasoner Model 25
3.4.1 SapBERT Encoding 26
3.4.2 GNN Aggregation 27
3.4.3 Path Exploration and Encoding 29
3.4.4 Path Ranking and Loss Function 30
3.5 LLM Pipeline for Questioning 32
3.5.1 Path Analysis and Summarization 32
3.5.2 Differential Confidence Estimation 33
3.5.3 3-Stage Strategic Questioning 34
Chapter 4 Experiments 40
4.1 Experimental Setup 40
4.1.1 Dataset and Benchmark 40
4.1.2 Evaluation Metrics 43
4.1.3 Baseline Models 44
4.1.4 Implementation Details 45
4.2 Results 46
4.2.1 CUI Prediction Performance 46
4.2.2 Downstream QA Task Performance 48
4.3 Qualitative Analysis: Case Study 50
4.3.1 Case Study with Predefined Options 50
4.3.2 Reasoning without Predefined Options: Dynamic Hypothesis Generation 55
Chapter 5 Conclusion 58
References 61
Appendix A — Generative Patient 68
A.1 Motivation and Design 68
A.2 Implementation 69
dc.language.iso: en
dc.subject: 知識圖譜 [zh_TW]
dc.subject: 大型語言模型 [zh_TW]
dc.subject: 互動式AI [zh_TW]
dc.subject: 醫療推理 [zh_TW]
dc.subject: 主動提問 [zh_TW]
dc.subject: Proactive Questioning [en]
dc.subject: Medical Reasoning [en]
dc.subject: Interactive AI [en]
dc.subject: Knowledge Graph [en]
dc.subject: Large Language Models [en]
dc.title: 基於知識圖譜引導大型語言模型主動提問:應用於多輪交互式醫療推理 [zh_TW]
dc.title: KG-Guided Proactive Questioning for LLMs in Multi-turn Interactive Medical Reasoning [en]
dc.type: Thesis
dc.date.schoolyear: 113-2
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 李宏毅;許永真;施吉昇 [zh_TW]
dc.contributor.oralexamcommittee: Hung-Yi Lee;Yung-Jen Hsu;Chi-Sheng Shih [en]
dc.subject.keyword: 大型語言模型,知識圖譜,主動提問,醫療推理,互動式AI [zh_TW]
dc.subject.keyword: Large Language Models,Knowledge Graph,Proactive Questioning,Medical Reasoning,Interactive AI [en]
dc.relation.page: 70
dc.identifier.doi: 10.6342/NTU202504066
dc.rights.note: 未授權 (not authorized for public access)
dc.date.accepted: 2025-08-12
dc.contributor.author-college: 電機資訊學院
dc.contributor.author-dept: 資訊工程學系
dc.date.embargo-lift: N/A
Appears in Collections: 資訊工程學系

Files in this item:
ntu-113-2.pdf, 13.52 MB, Adobe PDF (restricted access)