Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/72680

Full metadata record

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 賴飛羆 | |
| dc.contributor.author | Pei-Yang Ye | en |
| dc.contributor.author | 葉沛陽 | zh_TW |
| dc.date.accessioned | 2021-06-17T07:03:28Z | - |
| dc.date.available | 2024-07-31 | |
| dc.date.copyright | 2019-07-31 | |
| dc.date.issued | 2018 | |
| dc.date.submitted | 2019-07-30 | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/72680 | - |
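| dc.description.abstract | Discharge planning is the bridge through which the hospital arranges follow-up care. When the handover happens early and well, it reduces the hospital days lost to arranging where the patient goes next, lowers the readmission rate after discharge, increases the satisfaction of patients and their families with the medical team, and improves the quality of care after the patient returns home or enters a facility. Deciding on a discharge destination often delays discharge because it takes several rounds of communication among medical staff, the patient, and the family. In this study, we aim to build a machine learning classification model that quickly determines a suitable discharge destination from the patient's data. The data include the patient's basic information, admission nursing assessment, discharge care summary, departments at admission and discharge, fall-risk score, the number of each type of therapy during hospitalization (physical, occupational, psychological, speech), and diagnosis codes. For prediction we define three classes (transfer to another hospital, transfer to a facility, home). Because we face a severe data imbalance problem, we adopt an ensemble method to address it. In reality, some patients insist on going home even though they need a transfer to a hospital or a facility, so we use a clustering method (K-means) to remove outliers from the home class. Finally, we use a Gradient Boosted Decision Tree and evaluate the model's predictions with a confusion matrix and a custom weighted macro F5 score. Accuracy reaches 65% for the hospital-transfer class, 89% for the facility-transfer class, and 83% for the home class. Overall, our custom weighted macro F5 score reaches 0.823. We also identify several important features, such as the Barthel Index score, length of stay, age, and the primary caregiver after discharge. | zh_TW |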
| dc.description.abstract | Discharge planning is the bridge through which a hospital arranges a patient's follow-up care. When it starts early enough, it can shorten the extra hospital stay spent arranging follow-up, lower the readmission rate, raise patient satisfaction with the hospital and medical staff, and improve the quality of follow-up care after discharge. In this study, we aim to build a machine learning classification model that quickly determines an appropriate discharge destination for a patient. The patient data include basic information, the admission note, the discharge note, department, fall-risk score, the number of therapy sessions during the hospital stay (physical, occupational, psychological, speech), and diagnosis codes. We define three destination classes: “Hospital,” “Facility,” and “Home.” Because the data are severely imbalanced, we adopt an ensemble method. In practice, some patients insist on going home even when they need to go to a hospital or a facility, so we use K-means to remove outliers from the “Home” class. Finally, we train a Gradient Boosted Decision Tree and evaluate its predictions with a confusion matrix and a weighted macro F5 score ($\text{F(5)-Score}_{am}$). The accuracy reached 65% for “Hospital,” 89% for “Facility,” and 83% for “Home.” Overall, $\text{F(5)-Score}_{am}$ reached 0.823. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-17T07:03:28Z (GMT). No. of bitstreams: 1 ntu-107-R06922134-1.pdf: 2264226 bytes, checksum: 3d2a10517ddc96bffd7298d47587643f (MD5) Previous issue date: 2018 | en |
| dc.description.tableofcontents | Acknowledgements; Chinese Abstract; Abstract; Contents; List of Figures; List of Tables; Chapter 1 Introduction; Chapter 2 Related Work; 2.1 Discharge Destination Prediction; 2.2 Data Imbalance Problem; 2.2.1 Sampling Methods; 2.2.2 Cost-Sensitive Methods; Chapter 3 Dataset; 3.1 Description; 3.2 Discharge Destination; 3.3 Patient Features; 3.3.1 Patient Basic Information; 3.3.2 Admission Note; 3.3.3 Discharge Note; 3.3.4 Department; 3.3.5 Fall Risk Assessment; 3.3.6 Therapy in Hospitalization; 3.3.7 Diagnosis Code; Chapter 4 Methods; 4.1 Workflow; 4.2 Data Preprocessing; 4.2.1 One-hot Encoding; 4.2.2 Categorical Features; 4.2.3 Missing Value; 4.2.4 Normalization; 4.3 Feature Selection; 4.3.1 Gini Impurity; 4.3.2 Gini Impurity for Feature Importance; 4.4 Learning Algorithms; 4.4.1 K-means; 4.4.2 Gradient Boosted Decision Tree for Classification; 4.5 Data Anomaly; 4.6 Dealing with Data Imbalances; 4.6.1 Naïve Random Over-sampling; 4.6.2 Ensemble Method; 4.7 Evaluation Metrics; 4.7.1 Common Metrics; 4.7.2 Custom Metric - Adjusted Macro F Measure; Chapter 5 Results and Discussion; 5.1 Prediction without ICD Diagnosis; 5.1.1 Prediction Performance; 5.1.2 Feature Importance; 5.2 Adding ICD Diagnosis; 5.2.1 Prediction Performance; 5.3 Imbalance Data; 5.4 Outlier Removal; 5.4.1 Comparison of Results; 5.4.2 Prediction of Outlier; 5.5 Other Machine Learning Methods; Chapter 6 Conclusion; REFERENCE | |
| dc.language.iso | en | |
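| dc.subject | 多分類問題 (multi-class classification problem) | zh_TW |
| dc.subject | 機器學習 (machine learning) | zh_TW |
| dc.subject | 資料不平衡 (data imbalance) | zh_TW |
| dc.subject | 異常值 (outlier) | zh_TW |
| dc.subject | 出院目的地 (discharge destination) | zh_TW |
| dc.subject | 重要特徵 (important features) | zh_TW |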
| dc.subject | feature importance | en |
| dc.subject | machine learning | en |
| dc.subject | multi-class classification | en |
| dc.subject | data imbalance problem | en |
| dc.subject | outlier removal | en |
| dc.subject | discharge destination | en |
| dc.title | 利用機器學習預測住院病人出院之目的地 | zh_TW |
| dc.title | Discharge Destination Prediction for Hospitalized Patients Using Machine Learning | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 107-2 | |
| dc.description.degree | Master's | |
| dc.contributor.oralexamcommittee | 曹昭懿,莊仁輝,胡文郁,成佳憲 | |
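| dc.subject.keyword | 出院目的地 (discharge destination), 機器學習 (machine learning), 多分類問題 (multi-class classification problem), 資料不平衡 (data imbalance), 異常值 (outlier), 重要特徵 (important features) | zh_TW |
| dc.subject.keyword | discharge destination, machine learning, multi-class classification, data imbalance problem, outlier removal, feature importance | en |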
| dc.relation.page | 48 | |
| dc.identifier.doi | 10.6342/NTU201902183 | |
| dc.rights.note | Paid license | |
| dc.date.accepted | 2019-07-30 | |
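| dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW |
| dc.contributor.author-dept | Graduate Institute of Computer Science and Information Engineering | zh_TW |
| Appears in Collections: | Department of Computer Science and Information Engineering | |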
Files in this item:

| File | Size | Format | |
|---|---|---|---|
| ntu-107-1.pdf (not authorized for public access) | 2.21 MB | Adobe PDF | |

Items in the system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated.
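
The abstract evaluates the model with a weighted macro F5 score but this record does not give its formula. The sketch below is one plausible reading, not the thesis's definition: it assumes "F5" means the standard F-beta score with beta = 5 (recall weighted more heavily than precision), computed per class from a confusion matrix and then combined with per-class weights. The function names, the weights, and the example matrix are all hypothetical.

```python
def f_beta(precision, recall, beta=5.0):
    """Standard F-beta score; beta > 1 favors recall over precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def weighted_macro_f_beta(confusion, weights, beta=5.0):
    """Weighted average of per-class F-beta scores.

    confusion[i][j] = number of samples of true class i predicted as class j.
    weights[i]      = assumed importance of class i (need not sum to 1).
    """
    n = len(confusion)
    scores = []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        scores.append(f_beta(precision, recall, beta))
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


# Made-up confusion matrix; rows are true Hospital, Facility, Home.
example = [
    [13, 2, 5],
    [1, 16, 3],
    [4, 6, 50],
]
score = weighted_macro_f_beta(example, weights=[1.0, 1.0, 1.0])
print(f"weighted macro F5 (illustrative data): {score:.3f}")
```

With equal weights this reduces to the ordinary macro average; giving the minority classes larger weights is one way such a metric could emphasize them under class imbalance.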
