Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97092

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 周呈霙 | zh_TW |
| dc.contributor.advisor | Cheng-Ying Chou | en |
| dc.contributor.author | 吳松濤 | zh_TW |
| dc.contributor.author | Sung-Tao Wu | en |
| dc.date.accessioned | 2025-02-27T16:09:23Z | - |
| dc.date.available | 2025-02-28 | - |
| dc.date.copyright | 2025-02-27 | - |
| dc.date.issued | 2025 | - |
| dc.date.submitted | 2025-02-10 | - |
| dc.identifier.citation | [1] Rinaldo Bellomo, John A. Kellum, and Claudio Ronco. Acute kidney injury. Lancet, 380(9843):756–66, 2012.
[2] A. Khwaja. KDIGO clinical practice guidelines for acute kidney injury. Nephron Clin Pract, 120(4):c179–84, 2012.
[3] H. R. de Geus, M. G. Betjes, and J. Bakker. Biomarkers for the prediction of acute kidney injury: a narrative review on current status and future challenges. Clinical Kidney Journal, 5(2):102–108, 2012.
[4] T. Ali, I. Khan, W. Simpson, G. Prescott, J. Townend, W. Smith, and A. MacLeod. Incidence and outcomes in acute kidney injury: a comprehensive population-based study. Journal of the American Society of Nephrology, 18(4):1292–1298, 2007.
[5] John Kellum, Paola Romagnani, Gloria Ashuntantang, Claudio Ronco, Alexander Zarbock, and Hans-Joachim Anders. Acute kidney injury. Nature Reviews Disease Primers, 7:52, 2021.
[6] J. A. Kellum, N. Lameire, and the KDIGO AKI Guideline Work Group. Diagnosis, evaluation, and management of acute kidney injury: a KDIGO summary (part 1). Critical Care, 17(1):204, 2013.
[7] N. Lameire, J. A. Kellum, and the KDIGO AKI Guideline Work Group. Contrast-induced acute kidney injury and renal support for acute kidney injury: a KDIGO summary (part 2). Critical Care, 17(1):205, 2013.
[8] C. V. Thakar, A. Christianson, R. Freyberg, P. Almenoff, and M. L. Render. Incidence and outcomes of acute kidney injury in intensive care units: a Veterans Administration study. Critical Care Medicine, 37(9):2552–2558, 2009.
[9] H. E. Wang, P. Muntner, G. M. Chertow, and D. G. Warnock. Acute kidney injury and mortality in hospitalized patients. American Journal of Nephrology, 35(4):349–355, 2012.
[10] T. Hernandez-Boussard, K. L. Monda, B. C. Crespo, and D. Riskin. Real world evidence in cardiovascular medicine: ensuring data validity in electronic health record-based studies. Journal of the American Medical Informatics Association, 26(11):1189–1194, 2019.
[11] A. E. W. Johnson et al. MIMIC-III, a freely accessible critical care database. Sci. Data, 3:160035, 2016.
[12] A. E. W. Johnson, L. Bulgarelli, L. Shen, et al. MIMIC-IV, a freely accessible electronic health record dataset. Sci. Data, 10:1, 2023.
[13] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics, pages 4171–4186, 2019.
[14] J. Lee, W. Yoon, S. Kim, D. Kim, S. Kim, C. H. So, and J. Kang. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics, 2019.
[15] Emily Alsentzer, John R. Murphy, Willie Boag, Wei-Hung Weng, Di Jin, Tristan Naumann, and Matthew McDermott. Publicly available clinical BERT embeddings. arXiv preprint arXiv:1904.03323, 2019.
[16] Joyce C. Ho, Joydeep Ghosh, Steve R. Steinhubl, Walter F. Stewart, Joshua C. Denny, Bradley A. Malin, and Jimeng Sun. Limestone: High-throughput candidate phenotype generation via tensor factorization. Journal of Biomedical Informatics, 52:199–211, 2014.
[17] Jiayu Zhou, Fei Wang, Jianying Hu, and Jieping Ye. From micro to macro: data-driven phenotyping by densification of longitudinal electronic medical records. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 135–144. ACM, 2014.
[18] Yoni Halpern, Steven Horng, Youngduck Choi, and David Sontag. Electronic medical record phenotyping using the anchor and learn framework. Journal of the American Medical Informatics Association, 23(4):731–740, 2016.
[19] L. Z. Zhou, X. B. Yang, Y. Guan, X. Xu, M. T. Tan, F. F. Hou, and P. Y. Chen. Development and validation of a risk score for prediction of acute kidney injury in patients with acute decompensated heart failure: a prospective cohort study in China. Journal of the American Heart Association, 5(11):e004035, 2016.
[20] J. B. O'Neal, A. D. Shaw, and F. T. Billings 4th. Acute kidney injury following cardiac surgery: current understanding and future directions. Critical Care, 20(1):187, 2016.
[21] R. J. Kate, R. M. Perez, D. Mazumdar, K. S. Pasupathy, and V. Nilakantan. Prediction and detection models for acute kidney injury in hospitalized older adults. BMC Medical Informatics and Decision Making, 16(1):39, 2016.
[22] L. N. Sanchez-Pinto and R. G. Khemani. Development of a prediction model of early acute kidney injury in critically ill children using electronic health record data. Pediatric Critical Care Medicine, 17(6):508–515, 2016.
[23] M. A. Perazella. The urine sediment as a biomarker of kidney disease. American Journal of Kidney Diseases, 66(5):748–755, 2015.
[24] K. F. Kerr, A. Meisner, H. Thiessen-Philbrook, S. G. Coca, and C. R. Parikh. Developing risk prediction models for kidney injury and assessing incremental value for novel biomarkers. Clinical Journal of the American Society of Nephrology, 9(8):1488–1496, 2014.
[25] Y. Li, L. Yao, C. Mao, A. Srivastava, X. Jiang, and Y. Luo. Early prediction of acute kidney injury in critical care setting using clinical notes. In 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pages 683–686, 2018.
[26] N. Tomašev, X. Glorot, J. W. Rae, M. Zielinski, H. Askham, A. Saraiva, A. Mottram, C. Meyer, S. Ravuri, and I. Protsyuk. A clinically applicable approach to continuous prediction of future acute kidney injury. Nature, 572(7767):116–119, 2019.
[27] M. Sun, J. Baron, A. Dighe, P. Szolovits, R. G. Wunderink, T. Isakova, and Y. Luo. Early prediction of acute kidney injury in critical care setting using clinical notes and structured multivariate physiological measurements. Stud Health Technol Inform, 264:368–372, 2019.
[28] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in Neural Information Processing Systems, 30, 2017.
[29] Juntao Yu, Bernd Bohnet, and Massimo Poesio. Named entity recognition as dependency parsing. arXiv preprint arXiv:2005.07150, 2020.
[30] I. Yamada, A. Asai, H. Shindo, H. Takeda, and Y. Matsumoto. LUKE: deep contextualized entity representations with entity-aware self-attention. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 6442–6454, 2020.
[31] X. Li et al. Dice loss for data-imbalanced NLP tasks. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 465–476, 2020.
[32] B. Xu, Q. Wang, Y. Lyu, Y. Zhu, and Z. Mao. Entity structure within and throughout: modeling mention dependencies for document-level relation extraction. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 14149–14157, 2021.
[33] D. Ye, Y. Lin, and M. Sun. Pack together: entity and relation extraction with levitated marker. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, volume 1, pages 4904–4917, 2021.
[34] A. D. Cohen, S. Rosenman, and Y. Goldberg. Relation classification as two-way span-prediction. arXiv preprint arXiv:2010.04829, 2021.
[35] S. Lyu and H. Chen. Relation classification with entity type restriction. In Findings of the Association for Computational Linguistics: ACL-IJCNLP, pages 390–395, 2021.
[36] J. Wang and W. Lu. Two are better than one: joint entity and relation extraction with table-sequence encoders. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1706–1721, 2020.
[37] H. Jiang et al. SMART: Robust and efficient fine-tuning for pre-trained natural language models through principled regularized optimization. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 2177–2190, 2020.
[38] Z. Yang et al. XLNet: Generalized autoregressive pretraining for language understanding. In Proceedings of the 33rd International Conference on Neural Information Processing Systems, pages 5753–5763, 2019.
[39] C. Raffel et al. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res., 21:1–67, 2019.
[40] Z.-Z. Lan et al. ALBERT: a lite BERT for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942, 2019.
[41] Z. Zhang, J. Yang, and H. Zhao. Retrospective reader for machine reading comprehension. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 14506–14514, 2021.
[42] S. Garg, T. Vu, and A. Moschitti. TANDA: transfer and adapt pre-trained transformer models for answer sentence selection. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 7780–7788, 2020.
[43] R. Bommasani et al. On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258, 2021.
[44] Y. Gu et al. Domain-specific language model pretraining for biomedical natural language processing. ACM Trans. Comput. Healthc., 3:1–23, 2022.
[45] H.-C. Shin et al. BioMegatron: larger biomedical domain language model. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 4700–4706, 2020.
[46] 郭唯崢. 利用機器學習方法和 MIMIC 資料庫早期預測加護病房中的急性腎衰竭 (Early prediction of acute kidney injury in the ICU using machine learning and the MIMIC database). Master's thesis, 國立臺灣大學 (National Taiwan University), Jan 2022. | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97092 | - |
| dc.description.abstract | 背景: 急性腎損傷(AKI)是一種會迅速惡化腎功能的嚴重病症,尤其在加護病房(ICU)環境中會導致顯著發病率與死亡率。傳統 AKI 診斷主要依賴血清肌酸酐和尿量等指標,這些方法可能存在診斷延遲問題。早期識別 AKI 對於及時介入治療和改善患者預後至關重要。本研究探索利用先進自然語言處理(NLP)技術挖掘臨床病歷文本,通過分析 MIMIC-III 資料集中的臨床敘事記錄來提升 AKI 早期檢測能力。
目標: 本研究提出 AKIBERT 模型,該模型採用 AKI 特定領域語料庫進行預訓練並整合時間序列數據分析。旨在開發一個用於 ICU 環境 AKI 預測的綜合系統,並探討最先進語言模型在此領域的應用潛力。
方法: 本研究採用結構化臨床特徵與非結構化臨床文本兩種數據形式。經預處理後,包含 AKI 狀態與生命徵象的結構化數據集篩選出 33,825 位患者。針對非結構化數據,使用基於 BERT 的嵌入方法(包括 BERT、BioBERT、ClinicalBERT、BCBERT),結合 LSTM 模型處理入院後 24 小時內的臨床數據。評估指標包含 AUROC、特異性與敏感性,並採用校準曲線評估模型可靠性。本研究在結構化預測框架內比較各 BERT 變體的效能差異,並開發 AKI 領域專用模型 AKIBERT 以提升文本預測能力。
結果: 採用最大規模 AKI 專用語料庫預訓練的 AKIBERT T7 模型,在 24-36 小時與 24-168 小時預測窗口分別達到 0.8728 與 0.8488 的 AUROC 值。校準分析顯示 AKIBERT 具有更佳的概率校準特性。在兩個預測時間窗中,AKIBERT 均顯著優於標準 BERT 模型,這表明即使是有限度的 AKI 領域適應性訓練,也能通過增強臨床文本解析能力顯著提升 AKI 預測效能。 | zh_TW |
| dc.description.abstract | Background: Acute Kidney Injury (AKI) is a severe condition that rapidly impairs kidney function, leading to significant morbidity and mortality, especially in ICU settings. Traditionally, AKI is diagnosed using indicators such as serum creatinine and urine output, which may delay detection. Early recognition of AKI is crucial for timely intervention and improving patient outcomes. This research explores advanced Natural Language Processing (NLP) to mine clinical notes, aiming to improve early detection of AKI by utilizing the insights in clinical narratives from the MIMIC-III dataset.
Objectives: This study introduces AKIBERT, a new AKI-domain model pretrained with an AKI-specific corpus, incorporating time-series data analysis. The aim is to develop a comprehensive system for predicting AKI in ICU settings and to explore the capabilities of state-of-the-art language models in this context.
Methods: I employed two data types: structured clinical features and unstructured clinical notes. The structured dataset, encompassing AKI status and vital signs, was filtered down to 33,825 patients after preprocessing. For unstructured data, I utilized BERT-based embedding methods, including BERT, BioBERT, ClinicalBERT, and BCBERT, followed by LSTM models to process clinical data from the initial 24 hours post-admission (see the illustrative pipeline sketch at the end of this record). I evaluated performance metrics such as AUROC, specificity, and sensitivity, and employed calibration curves to assess the reliability of the models. The efficacy of various BERT-based embedding methods was compared within a structured prediction framework. Moreover, this research introduced AKIBERT, a domain-specific BERT model, developed to enhance prediction capabilities for AKI-related texts.
Results: AKIBERT T7, the model pretrained with the most extensive AKI-specific corpus, achieved AUROC values of 0.8728 and 0.8488 for the 24-36 and 24-168 hour prediction windows, respectively. Calibration analysis underscores AKIBERT's reliability, showing better probability alignment compared to other BERT variants. Notably, in both prediction windows, AKIBERT consistently outperforms the standard BERT model, indicating that even limited AKI-specific training can significantly improve AKI prediction by enhancing the interpretation of clinical narratives. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-02-27T16:09:23Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2025-02-27T16:09:23Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Verification Letter from the Oral Examination Committee i
摘要 ii
Abstract iv
Contents vi
List of Figures ix
List of Tables xi
Denotation xii
Chapter 1 Introduction 1
Chapter 2 Literature Review 3
2.1 Data-driven sub-phenotyping with EHRs 3
2.2 AKI prediction and clinical narratives 4
2.3 Advancements in NLP for clinical text analysis 4
2.4 Summary 6
Chapter 3 Methodology 7
3.1 KDIGO guideline 7
3.2 Study population 8
3.3 WordPiece tokenization 9
3.4 Preprocessing of Clinical Texts for BERT Pretraining 10
3.5 Development of AKI Domain-Specific BERT Model 11
3.6 Data preprocessing 12
3.6.1 Existing embedding methods: BERT and its variants 12
3.6.2 Structured data: clinical features 15
3.6.3 Handling missing data 15
3.7 LSTM structure 16
Chapter 4 Results 18
4.1 Model performances with 24h-36h prediction window 18
4.1.1 Calibration curve analysis 21
4.1.1.1 BERT-variant embeddings 22
4.1.1.2 AKIBERT embeddings 24
4.1.2 Enhancements in BERT variants through AKI-specific pretraining 26
4.2 Model performances with 24h-168h prediction window 27
4.2.1 Calibration curve analysis 29
4.2.2 AKI corpus pretrained BERT variant models 30
Chapter 5 Discussions 32
5.1 Adaptability and overfitting in BERT variants from AKI corpus pretraining 32
5.2 Attention visualization 33
5.3 Limitations 34
Chapter 6 Conclusion 36
References 38
Appendix A — Calibration curve with 24h-168h prediction window 45
A.1 BERT-variant Embeddings 45
A.2 AKIBERT Embeddings 47 | - |
| dc.language.iso | en | - |
| dc.subject | MIMIC-III | zh_TW |
| dc.subject | BERT | zh_TW |
| dc.subject | 向量化演算法 | zh_TW |
| dc.subject | 機器學習 | zh_TW |
| dc.subject | 臨床紀錄 | zh_TW |
| dc.subject | 急性腎衰竭 | zh_TW |
| dc.subject | Machine Learning | en |
| dc.subject | Clinical Notes | en |
| dc.subject | MIMIC-III | en |
| dc.subject | Acute Kidney Injury | en |
| dc.subject | BERT | en |
| dc.subject | Embedding Algorithm | en |
| dc.title | 利用生理數據和概述領域自適應BERT模型向量化之臨床病歷早期預測急性腎衰竭 | zh_TW |
| dc.title | Early Prediction of Acute Kidney Injury Using Physiological and Clinical Data with Domain-Adapted BERT Embeddings | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 113-1 | - |
| dc.description.degree | 碩士 | - |
| dc.contributor.oralexamcommittee | 王偉仲;陳定立;葉育彰 | zh_TW |
| dc.contributor.oralexamcommittee | Wei-chung Wang;Ting-Li Chen;Yu-Chang Yeh | en |
| dc.subject.keyword | 急性腎衰竭,MIMIC-III,臨床紀錄,機器學習,向量化演算法,BERT | zh_TW |
| dc.subject.keyword | Acute Kidney Injury,MIMIC-III,Clinical Notes,Machine Learning,Embedding Algorithm,BERT | en |
| dc.relation.page | 48 | - |
| dc.identifier.doi | 10.6342/NTU202500512 | - |
| dc.rights.note | 未授權 | - |
| dc.date.accepted | 2025-02-11 | - |
| dc.contributor.author-college | 生物資源暨農學院 | - |
| dc.contributor.author-dept | 生物機電工程學系 | - |
| dc.date.embargo-lift | N/A | - |
| Appears in Collections: | 生物機電工程學系 |
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-113-1.pdf (restricted access) | 3.55 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
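
The Methods portion of the abstract above describes a two-stage pipeline: clinical notes from the first 24 hours of an ICU stay are embedded with a BERT-family model (BERT, BioBERT, ClinicalBERT, BCBERT, or the AKI-pretrained AKIBERT), and the resulting time-ordered sequence of embeddings is fed to an LSTM that outputs an AKI risk score, evaluated with AUROC and calibration curves. The sketch below is a minimal illustration of that idea only, not the thesis implementation: the checkpoint name `bert-base-uncased`, the `NoteSequenceLSTM` class, the 128-unit hidden size, and the toy notes and labels are all assumptions made for demonstration.

```python
"""Minimal sketch (not the thesis code) of a BERT-embedding + LSTM AKI classifier."""
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel
from sklearn.metrics import roc_auc_score

# Any BERT variant (BERT, BioBERT, ClinicalBERT, or an AKI-adapted checkpoint)
# could be swapped in here; "bert-base-uncased" is only a stand-in.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
encoder.eval()

@torch.no_grad()
def embed_notes(notes):
    """Encode a list of clinical-note chunks into one [CLS] vector per chunk."""
    batch = tokenizer(notes, padding=True, truncation=True,
                      max_length=512, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state      # (n_chunks, seq_len, 768)
    return hidden[:, 0, :]                           # [CLS] embeddings

class NoteSequenceLSTM(nn.Module):
    """LSTM over time-ordered note embeddings from the first 24 h of an ICU stay."""
    def __init__(self, emb_dim=768, hidden_dim=128):
        super().__init__()
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, x):                             # x: (batch, time_steps, emb_dim)
        _, (h_n, _) = self.lstm(x)
        return self.head(h_n[-1]).squeeze(-1)         # one AKI risk logit per stay

# --- toy usage with fabricated data ------------------------------------------
# Both toy stays have two note chunks, so they stack directly; real data would
# need padding to a common number of time steps.
notes_per_stay = [["pt with rising creatinine, oliguric overnight",
                   "urine output 15 ml/hr, nephrology consulted"],
                  ["afebrile, adequate urine output, stable labs",
                   "no acute events overnight"]]
labels = torch.tensor([1.0, 0.0])                     # AKI vs. no AKI (made up)

x = torch.stack([embed_notes(n) for n in notes_per_stay])   # (2, 2, 768)
model = NoteSequenceLSTM()
logits = model(x)
loss = nn.BCEWithLogitsLoss()(logits, labels)
loss.backward()                                       # one illustrative training step

probs = torch.sigmoid(logits).detach().numpy()
print("toy AUROC:", roc_auc_score(labels.numpy(), probs))   # metric used in the thesis
```

Swapping the stand-in checkpoint for a domain-adapted one (for example, an AKIBERT checkpoint, if available) would change only the `from_pretrained` calls; the downstream LSTM and evaluation stay the same, which mirrors how the thesis compares BERT variants within a single prediction framework.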
