Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96601
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 曾宇鳳 | zh_TW |
dc.contributor.advisor | Yufeng Jane Tseng | en |
dc.contributor.author | 游子慧 | zh_TW |
dc.contributor.author | Tzu-Hui Yu | en |
dc.date.accessioned | 2025-02-20T16:08:50Z | - |
dc.date.available | 2025-02-21 | - |
dc.date.copyright | 2025-02-20 | - |
dc.date.issued | 2025 | - |
dc.date.submitted | 2025-01-13 | - |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96601 | - |
dc.description.abstract | 傳統的臨床試驗招募方式通常以站點為中心,依賴患者資格篩查。這種方式效率較低,且往往難以招募足夠的參與者。患者一般處於被動等待試驗匹配的狀態,這使他們無法有效比較並選擇最適合的試驗。現有的以患者為中心的系統雖然簡化了資格標準並提供問卷,但仍需大量人工來處理標準重疊或重複的問題。
為了改善這一情況,我們推出了 Patient-LITE,一個基於大型語言模型(LLMs)技術的系統,能夠生成對於病患而言更易懂的問卷。它能將來自多個試驗的資格標準整合成一份簡潔的表單,達到 74-78% 的準確率。其技術創新在於混合專家方法(mixture-of-experts),將多個基於 GPT 的模型與優化策略相結合。主要特徵包括:(1)獨創的「分解後組合」資料管道流程(data pipeline),(2)經過微調(fine-tuning)的小型模型,用於高質量且具成本效益的問題生成,以及(3)提示工程以協助文字擷取與邏輯推理。Patient-LITE 是首個基於 LLM 的跨試驗問卷設計框架,並可根據不同疾病進行靈活調整。我們提供了評估指標和微調數據集,對試驗資格的表示、分類和驗證文獻做出了貢獻。這項研究期待為後續醫學研究者提供啟發,協助其開發技術,促進醫學知識的普及。 | zh_TW |
dc.description.abstract | The traditional site-centric approach to clinical trial recruitment, which relies on patient eligibility screening, is often inefficient and struggles to enroll sufficient participants. Patients typically wait passively for trial matches, limiting their ability to compare and select the most suitable trials. Existing patient-centric systems simplify complex criteria and provide questionnaires, but still require significant manual effort to navigate overlapping or duplicated criteria.
We introduce Patient-LITE, a tool that uses large language models (LLMs) to generate patient-friendly questionnaires consolidating eligibility criteria from multiple trials into one concise form, achieving a correctness rate of 74-78%. The technical innovation lies in a mixture-of-experts approach, combining multiple GPT-based models and optimization strategies. Key features include: (1) a break-then-assemble pipeline for cross-trial criterion extraction, (2) fine-tuned smaller models for high-quality, cost-efficient question generation, and (3) prompt engineering for context-aware logic and attribute extraction. Patient-LITE is the first LLM-based framework for generating inter-trial questionnaires, with adaptable prompts for various diseases. We provide evaluation metrics and a fine-tuning dataset, contributing to the literature on trial eligibility representation, classification, and verification. Our work can inspire researchers in the broader medical field to adopt similar methods, enhancing patient comprehension and democratizing medical knowledge. | en |
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-02-20T16:08:50Z No. of bitstreams: 0 | en |
dc.description.provenance | Made available in DSpace on 2025-02-20T16:08:50Z (GMT). No. of bitstreams: 0 | en |
dc.description.tableofcontents | Chinese Abstract (摘要)
Abstract
Contents
List of Figures
List of Tables
Chapter 1: Introduction
1.1 Backgrounds
1.1.1 The Prevailing Recruitment Paradigm: Site-Centric Matching
1.1.2 Transitioning to Patient-Centric Recruitment Solutions
1.1.3 LLMs for Enhanced Patient-Centric Eligibility Matching
1.2 Patient-LITE and Its Contribution
1.2.1 Aims and Scope
1.2.2 Research Questions
1.2.3 Technical Breakthrough and Clinical Relevance
1.2.4 Contribution and Limitations
Chapter 2: Literature Review
2.1 Trial Eligibility Extraction and Classification
2.2 Trial Eligibility Query and Questionnaire Generation
Chapter 3: Methods
3.1 System Overview
3.2 The "Breaker" Component
3.2.1 Pre-classification Stage
3.2.2 Classification Stage
3.3 The "Assembler" Component
3.3.1 Question Generation
3.3.2 Option Curation
3.4 Data and Model
3.4.1 Dataset
3.4.2 Model
Chapter 4: Results
4.1 Descriptive Statistics
4.2 Sample Output
4.2.1 Overview of Output
4.2.2 Highlight I: Fine-grained Categorization Enables Cross-Trial Criterion Integration
4.2.3 Highlight II: Fine-tuning Enables Significant Improvement Using Low-cost Models
4.2.4 Highlight III: Prompt Engineering Enables Context-Aware Logic and Attribute Extraction
4.3 Quantitative and Qualitative Evaluation
4.3.1 Overall Quantitative Performance
4.3.2 Error Analysis on the Breaker
4.3.3 Error Analysis on the Assembler
Chapter 5: Discussion
Chapter 6: Conclusion
References | - |
dc.language.iso | en | - |
dc.title | Patient-LITE:利用大型語言模型,打造以患者為中心的通用型臨床試驗資格篩選問卷 | zh_TW |
dc.title | Patient-LITE: Empowering Patients with a Generalizable LLM-Based Questionnaire Generator for Accelerated Trial Eligibility Screening | en |
dc.type | Thesis | - |
dc.date.schoolyear | 113-1 | - |
dc.description.degree | Master's | - |
dc.contributor.oralexamcommittee | 蘇柏翰;林盈安 | zh_TW |
dc.contributor.oralexamcommittee | Bo-Han Su;Ying-An Lin | en |
dc.subject.keyword | 大型語言模型,臨床試驗媒合,病患導向 | zh_TW |
dc.subject.keyword | Large Language Models (LLMs),Clinical Trial Matching,Patient-Centric | en |
dc.relation.page | 58 | - |
dc.identifier.doi | 10.6342/NTU202500069 | - |
dc.rights.note | Not authorized | - |
dc.date.accepted | 2025-01-13 | - |
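dc.contributor.author-college | College of Electrical Engineering and Computer Science | - |
dc.contributor.author-dept | Graduate Institute of Biomedical Electronics and Bioinformatics | - |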
dc.date.embargo-lift | N/A | - |
Appears in Collections: | Graduate Institute of Biomedical Electronics and Bioinformatics |
Files in This Item:
File | Size | Format | |
---|---|---|---|
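ntu-113-1.pdf (not currently authorized for public access) | 1.1 MB | Adobe PDF |
Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.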