Comparative Analysis of Machine Learning and Deep Learning Approaches for Heart Attack Prediction

莫子佩; Muhammad Aulia Rahman

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101778

Title:	Comparative Analysis of Machine Learning and Deep Learning Approaches for Heart Attack Prediction
Author:	莫子佩 Muhammad Aulia Rahman
Advisor:	廖世偉 Shih-Wei Liao
Keywords:	Heart Attack Prediction, Cardiovascular Disease, Machine Learning, Deep Learning, Ensemble Learning, Clinical Decision Support, ROC-AUC, Risk Prediction
Year of publication:	2026
Degree:	Master's
Abstract:	Cardiovascular diseases remain a leading cause of global mortality, with heart attacks presenting a particularly severe clinical and public health challenge due to their sudden onset and potentially fatal outcomes. Accurate and timely prediction of heart attack risk is therefore essential for effective prevention and early intervention. In recent years, both machine learning (ML) and deep learning (DL) techniques have been increasingly applied to cardiovascular risk prediction; however, there remains limited consensus regarding their relative effectiveness, particularly when applied to structured clinical datasets. This study presents a systematic comparative analysis of selected machine learning and deep learning models for heart attack prediction using a structured clinical dataset comprising 918 patient records with demographic, clinical, and lifestyle-related features.
Traditional ML models including Logistic Regression, Support Vector Machines, Random Forest, Gradient Boosting, and XGBoost were compared with DL approaches, namely a multilayer perceptron (MLP), a Keras-based deep neural network (DNN), and TabNet. All models were trained and evaluated under identical experimental conditions, employing a unified preprocessing pipeline, stratified train–test split, and standardized evaluation metrics. Model performance was assessed using accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (ROC-AUC), with particular emphasis on recall and ROC-AUC given their clinical relevance in minimizing missed diagnoses. The experimental results demonstrate that traditional machine learning models, particularly ensemble-based methods, generally outperformed deep learning approaches on this structured dataset. Random Forest achieved the highest overall accuracy (90.22%) and F1-score (91.35%), while maintaining strong recall (93.14%) and a competitive ROC-AUC (0.933), demonstrating its ability to correctly identify at-risk patients. SVM with an RBF kernel attained the highest ROC-AUC (0.949), reflecting superior discriminative capability between patients at risk and those not at risk. In contrast, deep learning models exhibited comparable but slightly lower performance, with the Keras DNN achieving 89.67% accuracy and a ROC-AUC of 0.949, the MLP scoring slightly lower, and TabNet performing substantially worse (82.61% accuracy, 0.912 ROC-AUC), likely due to the relatively small dataset size and limited feature richness. These findings suggest that, for structured tabular clinical data of moderate size, ensemble machine learning models offer a more effective and practical solution than deep learning approaches.
Beyond predictive performance, interpretable models such as Logistic Regression and Random Forest allow clinicians to understand how individual features contribute to predictions, which is crucial for clinical adoption. Overall, this study underscores the importance of aligning model selection with both data characteristics and clinical requirements, providing evidence-based guidance for the development of reliable heart attack prediction systems in healthcare decision support contexts.
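The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the thesis code. The class and function names are illustrative, and the confusion counts are hypothetical: the abstract reports neither a confusion matrix nor the split ratio. The counts were chosen only because, assuming a 184-record test set (roughly 20% of the 918 records), they reproduce the Random Forest accuracy, recall, and F1-score quoted above. The ROC-AUC helper uses the rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one) and is shown on a toy four-sample input, not on thesis data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (positive class = heart attack risk)."""
    tp: int  # at-risk patients correctly flagged
    fp: int  # not-at-risk patients wrongly flagged
    fn: int  # at-risk patients missed (the clinically costly error)
    tn: int  # not-at-risk patients correctly cleared


def accuracy(c: ConfusionCounts) -> float:
    return (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn)


def precision(c: ConfusionCounts) -> float:
    return c.tp / (c.tp + c.fp)


def recall(c: ConfusionCounts) -> float:
    return c.tp / (c.tp + c.fn)


def f1_score(c: ConfusionCounts) -> float:
    # Harmonic mean of precision and recall, written in terms of counts.
    return 2 * c.tp / (2 * c.tp + c.fp + c.fn)


def roc_auc(labels: list[int], scores: list[float]) -> float:
    """Rank-based ROC-AUC: the fraction of (positive, negative) pairs in
    which the positive case has the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# HYPOTHETICAL counts (assumption, not from the thesis): 184 test records,
# 102 of them positive, chosen to match the reported Random Forest figures.
HYPOTHETICAL_RF = ConfusionCounts(tp=95, fp=11, fn=7, tn=71)

if __name__ == "__main__":
    print(f"accuracy  = {accuracy(HYPOTHETICAL_RF):.2%}")   # 90.22%
    print(f"precision = {precision(HYPOTHETICAL_RF):.2%}")  # 89.62%
    print(f"recall    = {recall(HYPOTHETICAL_RF):.2%}")     # 93.14%
    print(f"F1-score  = {f1_score(HYPOTHETICAL_RF):.2%}")   # 91.35%
    # Toy scores: 3 of the 4 positive/negative pairs are ranked correctly.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))     # 0.75
```

Under these assumed counts, the 7 false negatives are the missed diagnoses that the emphasis on recall is meant to limit. ROC-AUC is computed from scores, not from thresholded labels, which is why a model can rank patients best without also having the highest accuracy.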
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101778
DOI:	10.6342/NTU202600221
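Full-text authorization:	Authorized (campus-only access)
Full-text release date:	2031-01-21
Appears in collections:	Master's Program in Smart Medicine and Health Informatics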

Files in this item:

File	Size	Format
ntu-114-1.pdf (not authorized for public access)	2.41 MB	Adobe PDF
