Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101006
| Title: | 不規則多變量時間序列中數值特徵的高效表徵方法:應用於加護醫療與慢性疾病預測 Efficient Representation of Numerical Embeddings in Irregular Multivariate Time Series: From Intensive Care to Chronic Disease Prediction |
| Author: | 簡大容 Ta-Jung Chien |
| Advisor: | 林澤 Che Lin |
| Keywords: | deep learning, electronic health record, missing value, multivariate time series data |
| Year of publication: | 2025 |
| Degree: | Master's |
| Abstract: | Irregularly sampled multivariate time series in electronic health records (EHRs) present major challenges for prognostic modeling due to uneven sampling intervals, asynchronous measurements, and pervasive missing data. We propose a novel embedding framework that combines a broadcast-based architecture with a Hadamard Product Numerical Embedding (HPNE) module. HPNE partitions each feature embedding into subspaces and modulates them with value-dependent gates, enabling fine-grained representation of sparse numerical features without imputation. This preserves feature identity while enhancing the capture of cross-feature interactions. A broadcast mechanism further integrates the feature and value dimensions efficiently, yielding representations that are robust to missingness and scalable to heterogeneous clinical datasets.
We evaluate the model on intensive care unit (ICU) cohorts (PhysioNet 2012 and MIMIC-III) for in-hospital mortality prediction and on a chronic disease cohort (the NTU hepatocellular carcinoma, HCC, dataset) for long-term prediction. Across both domains, our method consistently outperforms strong baselines, achieving state-of-the-art predictive accuracy while requiring no explicit imputation. Transfer learning experiments further show that pretraining on a large ICU dataset and then applying the model to smaller cohorts improves performance, demonstrating that HPNE captures semantically meaningful feature representations with strong cross-dataset generalizability. Overall, our framework surpasses prior approaches in both accuracy and robustness, even under distribution shifts between acute and chronic care populations. These findings highlight the potential of efficient, imputation-free embeddings as a foundation for reliable prognostic modeling across diverse EHR applications. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101006 |
| DOI: | 10.6342/NTU202504465 |
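| Full-text authorization: | Not authorized |
| Electronic full-text release date: | N/A |
| Appears in collections: | Graduate Institute of Communication Engineering |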
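
The abstract describes HPNE only at a high level: each feature embedding is split into subspaces, and each subspace is scaled by a gate that depends on the measured value. Since the full text is not publicly available, the following is only a minimal illustrative sketch of that idea, not the thesis implementation. The class name `GatedNumericalEmbedding`, the helper `embed_observations`, the sigmoid gate with one scalar weight and bias per subspace, and the random initialization are all assumptions made for illustration. Missing measurements are simply never passed in, so no imputation step appears.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


class GatedNumericalEmbedding:
    """Hypothetical sketch of a value-gated feature embedding.

    Each feature has one embedding vector of size `dim`, viewed as
    `num_subspaces` contiguous chunks. A scalar value produces one gate
    per chunk, and the gate multiplies every element of its chunk
    (an element-wise, i.e. Hadamard, product after broadcasting).
    """

    def __init__(self, num_features, dim, num_subspaces, seed=0):
        if dim % num_subspaces != 0:
            raise ValueError("dim must be divisible by num_subspaces")
        rng = random.Random(seed)
        self.dim = dim
        self.sub_dim = dim // num_subspaces
        # Assumed parameters: randomly initialised here, learned in practice.
        self.feature_emb = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                            for _ in range(num_features)]
        self.gate_w = [[rng.gauss(0.0, 1.0) for _ in range(num_subspaces)]
                       for _ in range(num_features)]
        self.gate_b = [[0.0] * num_subspaces for _ in range(num_features)]

    def embed(self, feature_id, value):
        """Return the gated embedding of one observed (feature, value) pair."""
        gates = [sigmoid(w * value + b)
                 for w, b in zip(self.gate_w[feature_id],
                                 self.gate_b[feature_id])]
        emb = self.feature_emb[feature_id]
        # Element i belongs to subspace i // sub_dim and shares that gate.
        return [emb[i] * gates[i // self.sub_dim] for i in range(self.dim)]


def embed_observations(model, observations):
    """Embed only the observed (time, feature_id, value) triples.

    Unobserved features are absent from the input, so nothing is imputed.
    """
    return [(t, model.embed(f, v)) for (t, f, v) in observations]
```

For example, `embed_observations(GatedNumericalEmbedding(3, 8, 4), [(0.0, 0, 1.2), (3.5, 2, -0.7)])` returns two time-stamped 8-dimensional vectors, one per observed measurement. A downstream sequence model would consume these vectors; that part is outside the scope of this sketch.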
Files in this item:
| File | Size | Format | |
|---|---|---|---|
| ntu-114-1.pdf (not authorized for public access) | 6.32 MB | Adobe PDF | |
Unless their copyright terms are specifically indicated, all items in this system are protected by copyright, with all rights reserved.
