Predicting Clinical Outcomes of Critically Ill Patients in Intensive Care Units with Combination of Machine Learning Models

Yu-Sheng Yu; 余育昇

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/73649

Title:	Predicting Clinical Outcomes of Critically Ill Patients in Intensive Care Units with Combination of Machine Learning Models
Author:	Yu-Sheng Yu (余育昇)
Advisor:	Fei-Pei Lai (賴飛羆)
Keywords:	intensive care unit, prediction, machine learning, mortality, length of stay
Year of publication:	2021
Degree:	Master's
Abstract:	Background: Although various predictive scoring systems have been developed from statistical analysis of data collected from large numbers of patients, predicting clinical outcomes for patients in intensive care units (ICUs) remains an important and difficult challenge. In recent years, advances in machine learning have produced algorithms with more powerful inference capabilities, and these have been used to analyze such data.
Objective: This study aimed to use a three-step strategy that combines three machine learning models to improve the sensitivity and precision of mortality prediction for critically ill patients, and to divide ICU length of stay (LOS) into four classes so that the outcome can be predicted with a classification model rather than a typical regression model.
Methods: A total of 4,228 critically ill adult patients were extracted from the NTUH CORE database. For the mortality model, after preprocessing, seven machine learning algorithms were trained on both the whole dataset and a balanced dataset. Three models were then selected: those with the highest sensitivity, moderate precision, and the highest precision, respectively. For the LOS classification model, we analyzed the distribution of LOS across the whole dataset and divided it into four classes for labeling. Multi-class prediction then yielded the predicted class and the corresponding class probabilities.
Results: In the mortality model, the highest-sensitivity model classified 588 of the 843 patients in the testing dataset into the low-risk group, which had a mortality rate of 2.6% (95% CI, 1.4 to 4.2%); the other 255 patients proceeded to the next step of prediction. After processing with the moderate-precision and highest-precision models, these 255 patients were further classified into the moderate-risk group (210 patients), high-moderate-risk group (26 patients), and adjusted high-risk group (19 patients), with mortality rates of 29% (95% CI, 23 to 35.7%), 73.1% (95% CI, 52.2 to 88.4%), and 94.7% (95% CI, 74 to 99.9%), respectively. In the LOS classification model, the weighted average F1-score was 0.604, and performance was better for patients with LOS of less than 7 days than for those with LOS of more than 7 days.
Conclusion: This study showed that a three-step strategy combining the highest-sensitivity, moderate-precision, and highest-precision models improved the predictability of 30-day mortality in critically ill patients.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/73649
DOI:	10.6342/NTU202100268
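Full-text authorization:	Paid authorization
Appears in collections:	Graduate Institute of Biomedical Electronics and Bioinformatics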
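
The three-step strategy described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how the three models' outputs map onto the four risk groups, so the mapping below is an assumption: a patient leaves the cascade at the first model that does not flag them. The names `stratify` and `threshold_model`, the single risk-score feature, and the cutoffs 0.2, 0.5, and 0.8 are all hypothetical stand-ins, not the thesis's models or parameters.

```python
from typing import Callable, Sequence

# A "model" is any callable that maps a feature vector to True (death predicted)
# or False. This is a hypothetical interface used only for illustration.
Model = Callable[[Sequence[float]], bool]


def stratify(
    features: Sequence[float],
    highest_sensitivity: Model,
    moderate_precision: Model,
    highest_precision: Model,
) -> str:
    """Assign a risk group with a three-step cascade (assumed mapping)."""
    # Step 1: the highest-sensitivity model rules patients out.
    if not highest_sensitivity(features):
        return "low"
    # Step 2: flagged by step 1 only.
    if not moderate_precision(features):
        return "moderate"
    # Step 3: flagged by steps 1 and 2 but not by the strictest model.
    if not highest_precision(features):
        return "high-moderate"
    # Flagged by all three models.
    return "high"


def threshold_model(cutoff: float) -> Model:
    """Toy model: predicts death when the first feature reaches the cutoff."""
    return lambda features: features[0] >= cutoff


# Hypothetical cutoffs; a lower cutoff flags more patients (higher sensitivity),
# a higher cutoff flags fewer (higher precision).
sensitive = threshold_model(0.2)
moderate = threshold_model(0.5)
precise = threshold_model(0.8)

if __name__ == "__main__":
    for score in (0.1, 0.3, 0.6, 0.9):
        group = stratify([score], sensitive, moderate, precise)
        print(f"score={score}: {group}")
```

In practice each toy model would be replaced by a trained classifier. The cascade order matters under this assumption: the most sensitive model runs first so that few patients who go on to die are placed in the low-risk group.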

Files in this item:

File	Size	Format
U0001-3001202103404100.pdf (not authorized for public access)	2.15 MB	Adobe PDF

