Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/95138
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 陳信希 | zh_TW
dc.contributor.advisor | Hsin-Hsi Chen | en
dc.contributor.author | 吳承光 | zh_TW
dc.contributor.author | Cheng-Kuang Wu | en
dc.date.accessioned | 2024-08-29T16:15:48Z | -
dc.date.available | 2024-08-30 | -
dc.date.copyright | 2024-08-29 | -
dc.date.issued | 2024 | -
dc.date.submitted | 2024-08-15 | -
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/95138 | -
dc.description.abstract | 醫療診斷推理是臨床工作中的重要能力,它讓醫師能從病人身上「蒐集關鍵資訊」,並且利用蒐集到的資訊「預測診斷」。本論文以問診作為研究案例,探討大型語言模型之診斷推理能力,並提出進一步提升此推理能力之方法論。此項研究之主要貢獻有三:
一、發展大型語言模型角色扮演評估框架,用以評估大型語言模型之問診能力;
二、提出少樣本、零樣本之提示工程方法論,此方法論結合醫師診斷推理之思考過程,並以實驗佐證其能提升大型語言模型「蒐集關鍵資訊」以及「預測診斷」之表現。
三、顯示大型語言模型能透過儲存及擷取自生成之診斷推理過程,持續增進其診斷預測之能力。 | zh_TW
dc.description.abstract | Medical diagnostic reasoning is a crucial capability in clinical practice, enabling physicians to "collect key information" from patients and "predict diagnoses" based on the collected information. This thesis investigates the diagnostic reasoning abilities of large language models (LLMs) using history taking as a case study and proposes methodologies to further enhance these reasoning abilities. The main contributions of this research are threefold:
(1) Development of an LLM Role-Playing Evaluation Framework to assess the history-taking abilities of LLMs.
(2) Introduction of few-shot and zero-shot prompting methodologies that integrate the diagnostic reasoning processes of physicians, with experimental evidence demonstrating their effectiveness in improving LLMs' performance in "collecting key information" and "predicting diagnoses".
(3) Demonstration that LLMs can continuously improve their diagnostic prediction capabilities by storing and retrieving self-generated diagnostic reasoning processes. | en
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-08-29T16:15:48Z. No. of bitstreams: 0 | en
dc.description.provenance | Made available in DSpace on 2024-08-29T16:15:48Z (GMT). No. of bitstreams: 0 | en
dc.description.tableofcontents | Acknowledgements
Abstract (in Chinese)
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Background
1.2 Motivation
1.3 Thesis Organization
Chapter 2 Related Work
2.1 History Taking
2.2 Reasoning Abilities of Large Language Models
Chapter 3 Datasets
3.1 Patient Profile
3.2 Data Statistics
Chapter 4 Collecting Diagnostic Information
4.1 The LLM-Role-Playing Evaluation Framework
4.2 Evaluation Metric
4.3 Experiments
4.3.1 Few-Shot Setting
4.3.1.1 Methodology
4.3.1.2 Results
4.3.2 Zero-Shot Setting
4.3.2.1 Methodology
4.3.2.2 Results
Chapter 5 Diagnosis Prediction
5.1 General Setup
5.2 Methodology
5.2.1 Non-streaming setting
5.2.2 Streaming setting
5.3 Experiments
5.3.1 Implementation
5.3.2 Results
5.4 Discussion
5.4.1 Confusion Matrices of Single-Agent and Multi-Agent Memory Methods
5.4.2 Ablation Studies on Multi-Agent Memory
Chapter 6 Conclusions
References | -
dc.language.iso | en | -
dc.subject | 醫療診斷推理 | zh_TW
dc.subject | 大型語言模型 | zh_TW
dc.subject | 問診 | zh_TW
dc.subject | History Taking | en
dc.subject | Large Language Models | en
dc.subject | Medical Diagnostic Reasoning | en
dc.title | 大型語言模型進行醫療診斷推理 | zh_TW
dc.title | Large Language Models Perform Diagnostic Reasoning | en
dc.type | Thesis | -
dc.date.schoolyear | 112-2 | -
dc.description.degree | Master's | -
dc.contributor.oralexamcommittee | 張嘉惠;王釧茹;鄭卜壬 | zh_TW
dc.contributor.oralexamcommittee | Chia-Hui Chang; Chuan-Ju Wang; Pu-Jen Cheng | en
dc.subject.keyword | 大型語言模型,醫療診斷推理,問診 | zh_TW
dc.subject.keyword | Large Language Models, Medical Diagnostic Reasoning, History Taking | en
dc.relation.page | 36 | -
dc.identifier.doi | 10.6342/NTU202403801 | -
dc.rights.note | Authorization granted (open access worldwide) | -
dc.date.accepted | 2024-08-15 | -
dc.contributor.author-college | College of Electrical Engineering and Computer Science | -
dc.contributor.author-dept | Department of Computer Science and Information Engineering | -
Appears in Collections: Department of Computer Science and Information Engineering

Files in This Item:
File | Size | Format
ntu-112-2.pdf | 1.67 MB | Adobe PDF
Unless their copyright terms are specifically indicated, all items in this repository are protected by copyright, with all rights reserved.
