NTU Theses and Dissertations Repository › 電機資訊學院 › 電信工程學研究所
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92205
Full metadata record
DC Field: Value [Language]
dc.contributor.advisor: 林澤 [zh_TW]
dc.contributor.advisor: Che Lin [en]
dc.contributor.author: 黃俊愷 [zh_TW]
dc.contributor.author: Chun-Kai Huang [en]
dc.date.accessioned: 2024-03-08T16:17:46Z
dc.date.available: 2024-03-09
dc.date.copyright: 2024-03-08
dc.date.issued: 2024
dc.date.submitted: 2024-02-17
dc.identifier.citation: [1] A. Sagheer and M. Kotb, “Time series forecasting of petroleum production using deep LSTM recurrent networks,” Neurocomputing, vol. 323, pp. 203–213, Jan. 2019. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0925231218311639
[2] Z. Che, S. Purushotham, K. Cho, D. Sontag, and Y. Liu, “Recurrent neural networks for multivariate time series with missing values,” Scientific Reports, vol. 8, no. 1, p. 6085, 2018.
[3] Y. Liang, Y. Xia, S. Ke, Y. Wang, Q. Wen, J. Zhang, Y. Zheng, and R. Zimmermann, “AirFormer: Predicting Nationwide Air Quality in China with Transformers,” Nov. 2022, arXiv:2211.15979 [cs, eess]. [Online]. Available: http://arxiv.org/abs/2211.15979
[4] G. Zerveas, S. Jayaraman, D. Patel, A. Bhamidipaty, and C. Eickhoff, “A Transformer-based Framework for Multivariate Time Series Representation Learning,” in Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. Virtual Event Singapore: ACM, Aug. 2021, pp. 2114–2124. [Online]. Available: https://dl.acm.org/doi/10.1145/3447548.3467401
[5] T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient Estimation of Word Representations in Vector Space,” Sep. 2013, arXiv:1301.3781 [cs]. [Online]. Available: http://arxiv.org/abs/1301.3781
[6] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, “Attention Is All You Need,” Dec. 2017, arXiv:1706.03762 [cs]. [Online]. Available: http://arxiv.org/abs/1706.03762
[7] S. Abnar and W. Zuidema, “Quantifying Attention Flow in Transformers,” May 2020, arXiv:2005.00928 [cs]. [Online]. Available: http://arxiv.org/abs/2005.00928
[8] J. Ma, Z. Shou, A. Zareian, H. Mansour, A. Vetro, and S.-F. Chang, “CDSA: Cross-Dimensional Self-Attention for Multivariate, Geo-tagged Time Series Imputation,” Aug. 2019, arXiv:1905.09904 [cs, stat]. [Online]. Available: http://arxiv.org/abs/1905.09904
[9] J. Zhong, N. Gui, and W. Ye, “Data Imputation with Iterative Graph Reconstruction,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 9, pp. 11399–11407, Jun. 2023. [Online]. Available: https://ojs.aaai.org/index.php/AAAI/article/view/26348
[10] N. Wu, B. Green, X. Ben, and S. O’Banion, “Deep Transformer Models for Time Series Forecasting: The Influenza Prevalence Case,” Jan. 2020, arXiv:2001.08317 [cs, stat]. [Online]. Available: http://arxiv.org/abs/2001.08317
[11] Y. Zhang and J. Yan, “Crossformer: Transformer Utilizing Cross-Dimension Dependency for Multivariate Time Series Forecasting,” in International Conference on Learning Representations (ICLR), 2023.
[12] R. Zuo, G. Li, B. Choi, S. S. Bhowmick, D. N.-y. Mah, and G. L. H. Wong, “SVP-T: A Shape-Level Variable-Position Transformer for Multivariate Time Series Classification,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 9, pp. 11497–11505, Jun. 2023. [Online]. Available: https://ojs.aaai.org/index.php/AAAI/article/view/26359
[13] J. Grigsby, Z. Wang, N. Nguyen, and Y. Qi, “Long-Range Transformers for Dynamic Spatiotemporal Forecasting,” Mar. 2023, arXiv:2109.12218 [cs, stat]. [Online]. Available: http://arxiv.org/abs/2109.12218
[14] Z. Ming-Zhe, “Deep STI: Deep stochastic time-series imputation on electronic medical records,” pp. 1–54, Jan. 2021, publisher: 國立清華大學 (National Tsing Hua University). [Online]. Available: https://www.airitilibrary.com/Publication/alDetailedMesh1?DocID=U0016-0203202211552296
[15] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, “Focal Loss for Dense Object Detection,” Feb. 2018, arXiv:1708.02002 [cs]. [Online]. Available: http://arxiv.org/abs/1708.02002
[16] J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, “Empirical evaluation of gated recurrent neural networks on sequence modeling,” arXiv preprint arXiv:1412.3555, 2014.
[17] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[18] Y.-A. Wang and Y.-N. Chen, “What Do Position Embeddings Learn? An Empirical Study of Pre-Trained Language Model Positional Encoding,” in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). Online: Association for Computational Linguistics, 2020, pp. 6840–6849. [Online]. Available: https://www.aclweb.org/anthology/2020.emnlp-main.555
[19] J. L. Ba, J. R. Kiros, and G. E. Hinton, “Layer Normalization,” Jul. 2016, arXiv:1607.06450 [cs, stat]. [Online]. Available: http://arxiv.org/abs/1607.06450
[20] D. Hendrycks and K. Gimpel, “Gaussian Error Linear Units (GELUs),” Jun. 2016, arXiv:1606.08415 [cs]. [Online]. Available: http://arxiv.org/abs/1606.08415
[21] F. E. Harrell Jr., K. L. Lee, and D. B. Mark, “Multivariable Prognostic Models: Issues in Developing Models, Evaluating Assumptions and Adequacy, and Measuring and Reducing Errors,” Statistics in Medicine, vol. 15, no. 4, pp. 361–387, 1996. [Online]. Available: https://onlinelibrary.wiley.com/doi/abs/10.1002/%28SICI%291097-0258%2819960229%2915%3A4%3C361%3A%3AAID-SIM168%3E3.0.CO%3B2-4
[22] A. L. Goldberger, L. A. Amaral, L. Glass, J. M. Hausdorff, P. C. Ivanov, R. G. Mark, J. E. Mietus, G. B. Moody, C. K. Peng, and H. E. Stanley, “PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals,” Circulation, vol. 101, no. 23, pp. E215–220, Jun. 2000.
[23] M. Horn, M. Moor, C. Bock, B. Rieck, and K. Borgwardt, “Set Functions for Time Series,” Sep. 2020, arXiv:1909.12064 [cs, stat]. [Online]. Available: http://arxiv.org/abs/1909.12064
[24] H. Harutyunyan, H. Khachatrian, D. C. Kale, G. Ver Steeg, and A. Galstyan, “Multitask learning and benchmarking with clinical time series data,” Scientific Data, vol. 6, no. 1, p. 96, Jun. 2019. [Online]. Available: https://www.nature.com/articles/s41597-019-0103-9
[25] L. Breiman, “Random forests,” Machine learning, vol. 45, pp. 5–32, 2001.
[26] T. Chen and C. Guestrin, “XGBoost: A scalable tree boosting system,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 785–794.
[27] D. P. Kingma and J. Ba, “Adam: A Method for Stochastic Optimization,” Jan. 2017, arXiv:1412.6980 [cs]. [Online]. Available: http://arxiv.org/abs/1412.6980
[28] T. Saito and M. Rehmsmeier, “The Precision-Recall Plot Is More Informative than the ROC Plot When Evaluating Binary Classifiers on Imbalanced Datasets,” PLOS ONE, vol. 10, no. 3, p. e0118432, Mar. 2015. [Online]. Available: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0118432
[29] Q. Pang, K. Qu, J.-Y. Zhang, S.-D. Song, S.-S. Liu, M.-H. Tai, H.-C. Liu, and C. Liu, “The Prognostic Value of Platelet Count in Patients With Hepatocellular Carcinoma,” Medicine, vol. 94, no. 37, p. e1431, Sep. 2015. [Online]. Available: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4635796/
[30] B. I. Carr and V. Guerra, “Serum Albumin Levels in Relation to Tumor Parameters in Hepatocellular Carcinoma Patients,” The International Journal of Biological Markers, vol. 32, no. 4, pp. 391–396, Oct. 2017. [Online]. Available: https://doi.org/10.5301/ijbm.5000300
[31] M.-F. Yuen, Y. Tanaka, D. Y.-T. Fong, J. Fung, D. K.-H. Wong, J. C.-H. Yuen, D. Y.-K. But, A. O.-O. Chan, B. C.-Y. Wong, M. Mizokami, and C.-L. Lai, “Independent risk factors and predictive score for the development of hepatocellular carcinoma in chronic hepatitis B,” Journal of Hepatology, vol. 50, no. 1, pp. 80–88, Jan. 2009. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0168827808005655
[32] Y. Bengio and Y. LeCun, “Scaling learning algorithms towards AI,” in Large Scale Kernel Machines. MIT Press, 2007.
[33] G. E. Hinton, S. Osindero, and Y. W. Teh, “A fast learning algorithm for deep belief nets,” Neural Computation, vol. 18, pp. 1527–1554, 2006.
[34] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016.
[35] J. Xu, H. Wu, J. Wang, and M. Long, “Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy,” Jun. 2022, arXiv:2110.02642 [cs]. [Online]. Available: http://arxiv.org/abs/2110.02642
[36] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional Networks for Biomedical Image Segmentation,” May 2015, arXiv:1505.04597 [cs]. [Online]. Available: http://arxiv.org/abs/1505.04597
[37] H. Shi, P. Xie, Z. Hu, M. Zhang, and E. P. Xing, “Towards Automated ICD Coding Using Deep Learning,” in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2018, pp. 1066–1076, arXiv:1711.04075 [cs]. [Online]. Available: http://arxiv.org/abs/1711.04075
[38] C.-S. Wei, “A BCLC staging system for hepatocellular carcinoma using Ensemble Learning and Multi-phase abdominal CT,” Thesis, Aug. 2023. [Online]. Available: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/89122
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92205
dc.description.abstract: 多變量時間序列數據包含在許多領域,例如能源監測、環境和醫療保健。有許多基於深度學習的方法試圖學習多元時間序列數據的有效表示法。然而,這些工作通常以同一個時間戳的所有變量當作模型的輸入,這導致了模型容易強調變量之間的時間關係。在這篇論文中,我們關注的資料為電子健康記錄數據。這種多元時間序列數據由於不規則採樣和異步測量而導致了非常可觀的缺失值。這種不規則的多變量時間序列數據對有效的表徵學習提出了挑戰。為了應對上述挑戰,我們提出了「可擴展數值嵌入」。可擴展數值嵌入是基於「值作為token」的概念,獨立地將每個值嵌入為輸入模型的向量。使用可擴展數值嵌入,特徵提取器不僅可以學習變量之間的時間關係,更有機會學習到不同變量之間的關係。我們進一步結合可擴展數值嵌入與Transformer encoder來構成TranSCANE。透過Transformer encoder的屏蔽機制和可擴展數值嵌入的幫助,TranSCANE能夠避免關注缺失值。也就是說,TranSCANE針對碎片化多變量時間序列數據而言,可以不需要對缺失值補值。此外,我們還提出了專門為TranSCANE設計的改良型滾動注意力計算,提高了我們模型的可解釋性。實驗結果表明,TranSCANE在三個不同的電子健康紀錄數據集上有最佳的表現。TranSCANE具有學習變量之間更多特徵關係的潛力,以及基於它不需補值而對不同插補的強健性。有了這些結果,我們相信TranSCANE是一個強大的在不規則多元時間序列數據之表示學習模型。 [zh_TW]
dc.description.abstract: Multivariate time series (MTS) data arise in numerous domains, such as energy monitoring, environmental science, and healthcare. Numerous deep-learning-based methods have been proposed that attempt to learn an effective representation of MTS data. However, these works commonly take variables at the same timestamp as model inputs, emphasizing only the temporal relation. This study focuses on electronic health records (EHR) data, which are full of missing values due to irregular sampling and asynchronous measurement. Such irregular MTS data pose additional challenges for effective representation learning. To tackle these challenges, we propose “SCAlable Numerical Embedding” (SCANE). SCANE is based on the concept of “value as a token” and embeds each value independently. With SCANE, the feature extractor can learn not only the temporal but also the feature-wise relations between variables. We further integrate SCANE with the Transformer encoder to form TranSCANE. With the masking mechanism and SCANE, TranSCANE avoids paying unnecessary attention to missing values; that is, TranSCANE is an imputation-free model for fragmentary MTS data. Moreover, we propose a revised rollout attention tailored for TranSCANE, which improves the interpretability of our model. Experimental results show that TranSCANE performs best on three different EHR datasets. It has the potential to learn more feature-wise relations between variables, and it is robust against different imputation schemes due to its imputation-free nature. Based on these results, we believe TranSCANE is a powerful representation-learning model for irregular MTS data. [en]
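The abstract's two key ideas, "value as a token" and the masking mechanism, can be illustrated with a small numpy sketch. This is a plausible reading of SCANE, not the thesis's actual implementation: every observed (time, feature, value) triple becomes one token whose embedding is the value-scaled feature embedding plus a time encoding, missing values simply produce no token, and padded positions are excluded from self-attention. All names (`scane_embed`, `masked_self_attention`), the sinusoidal time encoding, and the toy dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8                                    # toy embedding dimension (assumption)
feat_emb = rng.normal(size=(3, D))       # one embedding vector per variable

def time_enc(t):
    # simple sinusoidal time encoding (stand-in for the thesis's positional encoding)
    return np.sin(t / 10.0 ** (np.arange(D) / D))

def scane_embed(triplets):
    # "value as a token": each observed (time, feature_id, value) triple is
    # embedded independently; a missing value simply produces no token at all
    return np.stack([v * feat_emb[f] + time_enc(t) for t, f, v in triplets])

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def masked_self_attention(X, pad_mask):
    # pad_mask[j] == True marks a padding token; no query may attend to it,
    # so the model never pays attention to positions that were never observed
    scores = X @ X.T / np.sqrt(X.shape[-1])
    scores[:, pad_mask] = -1e9           # masking mechanism
    A = softmax(scores)
    return A @ X, A

# fragmentary series: irregular, asynchronous observations of 3 variables
triplets = [(0.0, 0, 1.2), (0.0, 2, -0.5), (3.0, 1, 0.7)]
X = scane_embed(triplets)                           # (3, D), no imputation needed
X_pad = np.vstack([X, np.zeros((2, D))])            # pad to a fixed length of 5
pad_mask = np.array([False, False, False, True, True])
out, A = masked_self_attention(X_pad, pad_mask)     # padded keys get ~0 weight
```

Note how the padded positions receive zero attention weight in `A`, which is the "imputation-free" property in miniature: the model's output for real tokens is unaffected by how (or whether) missing slots are filled.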
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-03-08T16:17:46Z. No. of bitstreams: 0 [en]
dc.description.provenance: Made available in DSpace on 2024-03-08T16:17:46Z (GMT). No. of bitstreams: 0 [en]
dc.description.tableofcontents:
Contents
Oral Examination Committee Certification i
Acknowledgements ii
Abstract (Chinese) iii
Abstract iv
Contents vi
List of Figures ix
Chapter 1 Introduction 1
Chapter 2 Related Work 5
2.1 Methods for Data with Missing Values 5
2.2 Deep Learning Models for MTS and EHR Data 6
Chapter 3 Methodology 8
3.1 Summarization 9
3.2 Scalable Numerical Embedding 11
3.3 Positional Encoding 12
3.4 Transformer Encoder 13
3.4.1 Self-attention Mechanism 14
3.4.2 Self-attention Mechanism with Scalable Numerical Embedding 16
3.5 The Overall Architecture of The Classification Model 17
3.6 Focal Loss and the Optimization Target 18
3.7 Revised Rollout Attention 19
3.7.1 Rollout Attention 19
3.7.2 Revised Rollout Attention 20
3.8 Evaluation Metrics 21
3.8.1 Accuracy and Confusion Matrix 21
3.8.2 Area under Receiver Operating Characteristic Curve (AUROC) 23
3.8.3 Area under Precision-Recall Curve (AUPRC) 26
3.8.4 Concordance Index (c-index) 27
Chapter 4 Experiment 28
4.1 Datasets 28
4.1.1 National Taiwan University Hepatocellular Carcinoma Dataset (HCC) 29
4.1.2 PhysioNet2012 Dataset (P12) 29
4.1.3 MIMIC-III Dataset (MI3) 30
4.2 Benchmarks and Model Architecture 32
4.3 Experiment Setup 36
4.3.1 Platform Detail 37
4.3.2 Subgroup Definition and Description 38
4.3.3 Priority of Evaluation Metrics 38
4.4 Experiment Results 40
4.4.1 Overall Result 40
4.4.2 Subgroup Analysis 42
4.4.3 Attention Weight Visualization 43
Chapter 5 Discussion 45
5.1 Is TranSCANE Really Unaffected by Different Imputations? 45
5.2 What Happens if We Disrupt The Time Information of The Testing Samples? 46
5.3 Subgroup Analysis 47
5.4 What Do We Learn From the Attention Map? 48
5.5 Why Should We Choose SCANE? 49
5.6 Future Work 50
Chapter 6 Conclusions 54
Bibliography 56
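Section 3.7 of the table of contents builds a revised rollout attention on top of rollout attention [7]. The thesis's revised variant is not described in this record, but the standard rollout it starts from, folding the residual connection into each layer's attention and then composing the layers, can be sketched as follows; the function name and the toy row-stochastic matrices are illustrative assumptions.

```python
import numpy as np

def rollout_attention(layer_attn):
    # Standard attention rollout (Abnar & Zuidema, 2020): average each layer's
    # attention with the identity to account for the residual connection,
    # renormalize the rows, then multiply the layers together.
    rollout = np.eye(layer_attn[0].shape[0])
    for A in layer_attn:                         # A: (tokens, tokens), rows sum to 1
        A_res = 0.5 * A + 0.5 * np.eye(A.shape[0])
        A_res = A_res / A_res.sum(axis=-1, keepdims=True)
        rollout = A_res @ rollout                # compose layer L ... layer 1
    return rollout

# two toy layers of row-stochastic attention over 4 tokens
rng = np.random.default_rng(1)
layers = [rng.random((4, 4)) for _ in range(2)]
layers = [A / A.sum(axis=-1, keepdims=True) for A in layers]
R = rollout_attention(layers)                    # rows of R still sum to 1
```

The result attributes the final representation of each token to the input tokens across all layers, which is the interpretability signal the thesis's attention-map analysis (Sections 4.4.3 and 5.4) relies on.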
dc.language.iso: en
dc.subject: 多變量時間序列數據 [zh_TW]
dc.subject: 深度學習 [zh_TW]
dc.subject: 表徵學習 [zh_TW]
dc.subject: 缺失值 [zh_TW]
dc.subject: missing value [en]
dc.subject: representation learning [en]
dc.subject: multivariate time series data [en]
dc.subject: deep learning [en]
dc.title: 透過可縮放的數值嵌入向量從不規則的多變量時間序列資料中學習:以電子健康紀錄為研究案例 [zh_TW]
dc.title: Learning From Irregular Multivariate Time Series Data with Scalable Numerical Embedding: A Case Study in Electronic Health Record [en]
dc.type: Thesis
dc.date.schoolyear: 112-1
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 孫紹華;陳縕儂 [zh_TW]
dc.contributor.oralexamcommittee: Shao-Hua Sun;Yun-Nung Chen [en]
dc.subject.keyword: 深度學習, 多變量時間序列數據, 缺失值, 表徵學習 [zh_TW]
dc.subject.keyword: deep learning, multivariate time series data, missing value, representation learning [en]
dc.relation.page: 62
dc.identifier.doi: 10.6342/NTU202400619
dc.rights.note: 未授權 (not authorized for public access)
dc.date.accepted: 2024-02-17
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science)
dc.contributor.author-dept: 電信工程學研究所 (Graduate Institute of Communication Engineering)
Appears in Collections: 電信工程學研究所 (Graduate Institute of Communication Engineering)

Files in This Item:
ntu-112-1.pdf: 2.21 MB, Adobe PDF, restricted access (not publicly available)