Comparative Analysis of Machine Learning and Deep Learning Approaches for Heart Attack Prediction

莫子佩; Muhammad Aulia Rahman

Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101778

Title:	Comparative Analysis of Machine Learning and Deep Learning Approaches for Heart Attack Prediction
Authors:	莫子佩 Muhammad Aulia Rahman
Advisor:	廖世偉 Shih-Wei Liao
Keyword:	Heart Attack Prediction, Cardiovascular Disease, Machine Learning, Deep Learning, Ensemble Learning, Clinical Decision Support, ROC-AUC, Risk Prediction
Publication Year :	2026
Degree:	Master's
Abstract:	Cardiovascular diseases remain a leading cause of death worldwide, and heart attacks (myocardial infarction) pose a particularly severe clinical and public health challenge because of their sudden onset and potentially fatal outcomes. Accurate and timely prediction of heart attack risk is therefore essential for effective prevention and early intervention. In recent years, both machine learning (ML) and deep learning (DL) techniques have been increasingly applied to cardiovascular risk prediction; however, there is still limited consensus on their relative effectiveness, particularly on structured clinical datasets. This study presents a systematic comparative analysis of selected ML and DL models for heart attack prediction using a structured clinical dataset of 918 patient records with demographic, clinical, and lifestyle-related features.
Traditional ML models (Logistic Regression, Support Vector Machines, Random Forest, Gradient Boosting, and XGBoost) were compared with DL approaches, namely a multilayer perceptron (MLP), a Keras-based deep neural network (DNN), and TabNet. All models were trained and evaluated under identical experimental conditions, with a unified preprocessing pipeline, a stratified train-test split, and standardized evaluation metrics. Model performance was assessed using accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (ROC-AUC), with particular emphasis on recall and ROC-AUC because of their clinical relevance to minimizing missed diagnoses. The experimental results show that traditional ML models, particularly ensemble-based methods, generally outperformed DL approaches on this structured dataset. Random Forest achieved the highest overall accuracy (90.22%) and F1-score (91.35%) while maintaining strong recall (93.14%) and a competitive ROC-AUC (0.933), indicating that it identifies at-risk patients effectively. SVM with an RBF kernel attained the highest ROC-AUC (0.949), reflecting the strongest ability to discriminate between at-risk and not-at-risk patients. The DL models were competitive but slightly weaker overall: the Keras DNN achieved 89.67% accuracy and an ROC-AUC of 0.949, the MLP scored slightly lower, and TabNet performed substantially worse (82.61% accuracy, 0.912 ROC-AUC), likely because of the relatively small dataset and limited feature richness. These findings suggest that, for structured tabular clinical data of moderate size, ensemble ML models are a more effective and practical choice than DL approaches.
Beyond predictive performance, interpretable models such as Logistic Regression and Random Forest let clinicians understand how individual features contribute to predictions, which is crucial for clinical adoption. Overall, this study underscores the importance of aligning model selection with both data characteristics and clinical requirements, and it offers evidence-based guidance for developing reliable heart attack prediction systems for healthcare decision support.
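The evaluation protocol described in the abstract (one shared preprocessing pipeline, a stratified train-test split, and the same five metrics for every model) can be illustrated with a minimal scikit-learn sketch. This is not the thesis code. The 80/20 split ratio, the hyperparameters, the random seeds, and the function name `compare_models` are all assumptions, and the data are synthetic because the thesis dataset is not reproduced in this record. Only scikit-learn models are included; XGBoost, the Keras DNN, and TabNet are omitted to keep the example dependency-free.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_models(X, y, test_size=0.2, seed=42):
    """Train several classifiers on one stratified split and score each one.

    test_size and seed are illustrative assumptions, not values from the thesis.
    """
    # Stratify on y so both splits keep the same class balance.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed)

    # Hyperparameters below are placeholders chosen for the sketch.
    models = {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "SVM (RBF)": SVC(kernel="rbf", probability=True, random_state=seed),
        "Random Forest": RandomForestClassifier(n_estimators=200,
                                                random_state=seed),
        "Gradient Boosting": GradientBoostingClassifier(random_state=seed),
        "MLP": MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=1000,
                             random_state=seed),
    }

    results = {}
    for name, model in models.items():
        # Same preprocessing for every model; the scaler is fit on the
        # training split only, which avoids leaking test statistics.
        pipe = make_pipeline(StandardScaler(), model)
        pipe.fit(X_train, y_train)
        pred = pipe.predict(X_test)
        prob = pipe.predict_proba(X_test)[:, 1]  # score for the positive class
        results[name] = {
            "accuracy": float(accuracy_score(y_test, pred)),
            "precision": float(precision_score(y_test, pred, zero_division=0)),
            "recall": float(recall_score(y_test, pred)),
            "f1": float(f1_score(y_test, pred)),
            "roc_auc": float(roc_auc_score(y_test, prob)),
        }
    return results


if __name__ == "__main__":
    # Synthetic stand-in: 918 rows to mirror the record count in the abstract;
    # the feature count and class structure are arbitrary.
    X, y = make_classification(n_samples=918, n_features=10, n_informative=6,
                               random_state=0)
    for name, metrics in compare_models(X, y).items():
        print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Recall and ROC-AUC are reported alongside accuracy because, as the abstract notes, missed diagnoses are the costly error in this setting. Numbers produced on the synthetic data say nothing about the results reported in the thesis.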
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101778
DOI:	10.6342/NTU202600221
Fulltext Rights:	Authorization granted (access restricted to campus)
Embargo Lift Date:	2031-01-21
Appears in Collections:	Master's Program in Smart Medicine and Health Informatics

Files in This Item:

File	Size	Format
ntu-114-1.pdf (Restricted Access)	2.41 MB	Adobe PDF
