Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99662

Full metadata record (DC field: value [language]):
dc.contributor.advisor: 傅立成 [zh_TW]
dc.contributor.advisor: Li-Chen Fu [en]
dc.contributor.author: 廖庭筠 [zh_TW]
dc.contributor.author: Ting-Yun Liao [en]
dc.date.accessioned: 2025-09-17T16:18:04Z
dc.date.available: 2025-09-18
dc.date.copyright: 2025-09-17
dc.date.issued: 2025
dc.date.submitted: 2025-08-06
dc.identifier.citation:
[1] Yu Zhou, Lan Wei, Song Gao, Jun Wang, and Zhigang Hu. Characterization of diffusion magnetic resonance imaging revealing relationships between white matter disconnection and behavioral disturbances in mild cognitive impairment: a systematic review. Frontiers in Neuroscience, 17:1209378, 2023.
[2] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Advances in neural information processing systems, 2017.
[3] Alexei Baevski, Yuhao Zhou, Abdelrahman Mohamed, and Michael Auli. wav2vec 2.0: A framework for self-supervised learning of speech representations. In Advances in Neural Information Processing Systems (NeurIPS), 2020.
[4] Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A simple framework for contrastive learning of visual representations. In International Conference on Machine Learning, pages 1597–1607. PMLR, 2020.
[5] Prannay Khosla, Piotr Teterwak, Chen Wang, Aaron Sarna, Yonglong Tian, Phillip Isola, Aaron Maschinot, Ce Liu, and Dilip Krishnan. Supervised contrastive learning. In Advances in Neural Information Processing Systems, volume 33, pages 18661–18673, 2020.
[6] Yu-Shan Liao. Dual-modal longitudinal cognitive impairment detection system based on autobiographical memory test. Master’s thesis, National Taiwan University, 2024.
[7] Harold Goodglass, Edith Kaplan, and Barbara Barresi. Boston Diagnostic Aphasia Examination (3rd ed.). Lippincott Williams & Wilkins, 2001.
[8] World Health Organization. Dementia. https://www.who.int/news-room/fact-sheets/detail/dementia, 2021. Accessed: 2025-06-24.
[9] Alzheimer’s Disease International. Dementia statistics. https://www.alzint.org/about/dementia-facts-figures/dementia-statistics/, 2023. Accessed: 2025-06-24.
[10] Alzheimer’s Disease International. World alzheimer report 2019: Attitudes to dementia. https://www.alzint.org/u/WorldAlzheimerReport2019.pdf, 2019. Accessed: 2025-06-24.
[11] Ronald C Petersen. Mild cognitive impairment as a diagnostic entity. Journal of Internal Medicine, 256(3):183–194, 2004.
[12] Ho-Ling Chang, Thiri Wai, Yu-Shan Liao, Sheng-Ya Lin, Yu-Ling Chang, and Li-Chen Fu. A dual-modal fusion framework for detection of mild cognitive impairment based on autobiographical memory. IEEE Journal of Biomedical and Health Informatics, 29(6):4474–4485, 2025.
[13] Kathleen C. Fraser, Kristina Lundholm Fors, Marie Eckerström, Fredrik Öhman, and Dimitrios Kokkinakis. Predicting mci status from multimodal language data using cascaded classifiers. Frontiers in Aging Neuroscience, 11, 2019.
[14] Kimberly D Mueller, Rebecca L Koscik, Bruce P Hermann, Sterling C Johnson, and Lyn S Turkstra. Declines in connected language are associated with very early mild cognitive impairment: Results from the wisconsin registry for alzheimer’s prevention. Frontiers in Aging Neuroscience, 9:437, 2017.
[15] Stina Saunders, Fasih Haider, Craig W. Ritchie, Graciela Muniz Terrera, and Saturnino Luz. Longitudinal observational cohort study: Speech for intelligent cognition change tracking and detection of alzheimer’s disease (side-ad). BMJ Open, 14(3):e082388, 2024.
[16] Kathleen C Fraser, Jed A Meltzer, and Frank Rudzicz. Linguistic features identify alzheimer’s disease in narrative speech. Journal of Alzheimer’s Disease, 49(2):407–422, 2016.
[17] Cláudia Drummond, Gabriel Coutinho, Rochele Paz Fonseca, Naima Assunção, Alina Teldeschi, Ricardo de Oliveira-Souza, Jorge Moll, Fernanda Tovar-Moll, and Paulo Mattos. Deficits in narrative discourse elicited by visual stimuli are already present in patients with mild cognitive impairment. Frontiers in Aging Neuroscience, 7:96, 2015.
[18] Stephanie Simpson, Mona Eskandaripour, and Brian Levine. Effects of healthy and neuropathological aging on autobiographical memory: A meta-analysis of studies using the autobiographical interview. The Journals of Gerontology, Series B: Psychological Sciences and Social Sciences, 78(10):1617–1624, 2023.
[19] Jeffrey L. Elman. Finding structure in time. Cognitive science, 14(2):179–211, 1990.
[20] Kyunghyun Cho, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. Learning phrase representations using RNN encoder–decoder for statistical machine translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1724–1734, 2014.
[21] Maria Ly, Gary Yu, Sang Joon Son, Tharick Pascoal, Helmet T. Karim, and the Alzheimer’s Disease Neuroimaging Initiative. Longitudinal accelerated brain age in mild cognitive impairment and alzheimer’s disease. Frontiers in Aging Neuroscience, 16:1433426, 2024.
[22] Ziad S Nasreddine, Natalie A Phillips, Valérie Bédirian, Simon Charbonneau, Victor Whitehead, Isabelle Collin, Jeffrey L Cummings, and Howard Chertkow. The montreal cognitive assessment, moca: a brief screening tool for mild cognitive impairment. Journal of the American Geriatrics Society, 53(4):695–699, 2005.
[23] Marshal F Folstein, Susan E Folstein, and Paul R McHugh. “Mini-mental state”: A practical method for grading the cognitive state of patients for the clinician. Journal of Psychiatric Research, 12(3):189–198, 1975.
[24] Pimarn Kantithammakorn, Proadpran Punyabukkana, Ploy N Pratanwanich, Solaphat Hemrungrojn, Chaipat Chunharas, and Dittaya Wanvarie. Using automatic speech recognition to assess thai speech language fluency in the montreal cognitive assessment (moca). Sensors (Basel), 22(4):1583, 2022.
[25] Matteo Luperto, Marta Romeo, Francesca Lunardini, Nicola Basilico, Carlo Abbate, Ray Jones, Angelo Cangelosi, Simona Ferrante, and N. Alberto Borghese. Evaluating the acceptability of assistive robots for early detection of mild cognitive impairment. In 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 1257–1264. IEEE, 2019.
[26] Pegah Hafiz, Kamilla Woznica Miskowiak, Lars Vedel Kessing, Andreas Elleby Jespersen, Kia Obenhausen, Lorant Gulyas, Katarzyna Zukowska, and Jakob Eyvind Bardram. The internet-based cognitive assessment tool: System design and feasibility study. JMIR Formative Research, 3(3):e13898, 2019.
[27] Saturnino Luz, Fasih Haider, Sofia de la Fuente, Davida Fromm, and Brian MacWhinney. Alzheimer’s dementia recognition through spontaneous speech: The ADReSS challenge. In Interspeech 2020, pages 2172–2176, 2020.
[28] Saturnino Luz, Fasih Haider, Sofia de la Fuente Garcia, Davida Fromm, and Brian MacWhinney. Detecting cognitive decline using speech only: The ADReSSo challenge, 2021.
[29] Benjamin Barrera-Altuna, Daeun Lee, Zaima Zarnaz, Jinyoung Han, and Seungbae Kim. The Interspeech 2024 TAUKADIAL challenge: Multilingual mild cognitive impairment detection with multimodal approach. In Interspeech 2024, pages 967–971, 2024.
[30] PROCESS Challenge Organizers. Process: Prediction and recognition of cognitive decline through spontaneous speech. https://processchallenge.github.io/, 2025. Accessed: 2025-06-21.
[31] Ayaka Yamanaka, Ikuma Sato, Shuichi Matsumoto, and Yuichi Fujino. Mild cognitive impairment screening system by multiple daily activity information—a method based on daily conversation. In Proceedings of Eighth International Congress on Information and Communication Technology, pages 349–360. Springer Nature Singapore, 2023.
[32] Laszlo Toth, Ildiko Hoffmann, Gabor Gosztolya, Veronika Vincze, Greta Szatloczki, Zoltan Banreti, Magdolna Pakaski, and Janos Kalman. A speech recognition-based solution for the automatic detection of mild cognitive impairment from spontaneous speech. Current Alzheimer Research, 15(2):130–138, 2018.
[33] Aparna Balagopalan, Benjamin Eyre, Frank Rudzicz, and Jekaterina Novikova. To bert or not to bert: Comparing speech and language-based approaches for alzheimer’s disease detection. In Proceedings of Interspeech 2020, pages 2167–2171, 2020.
[34] Kristin Qi, Jiatong Shi, Caroline Summerour, John A. Batsis, and Xiaohui Liang. Exploiting longitudinal speech sessions via voice assistant systems for early detection of cognitive decline. arXiv preprint arXiv:2410.12885, 2024.
[35] Sabah Al-Hameed. Audio Based Signal Processing and Computational Models for Early Detection and Prediction of Dementia and Mood Disorders. PhD thesis, University of Sheffield, Sheffield, UK, 2019.
[36] Jordi Laguarta and Brian Subirana. Longitudinal speech biomarkers for automated alzheimer’s detection. Frontiers in Computer Science, 3:624694, 2021.
[37] Yasunori Yamada, Kaoru Shinkawa, and Keita Shimmei. Atypical repetition in daily conversation on different days for detecting alzheimer disease: Evaluation of phonecall data from regular monitoring service. JMIR Mental Health, 7(1):e16790, 2020.
[38] Florian Eyben, Klaus R. Scherer, Björn W. Schuller, Johan Sundberg, Elisabeth André, Carlos Busso, Laurence Y. Devillers, Julien Epps, Petri Laukka, Shrikanth S. Narayanan, and Khiet P. Truong. The geneva minimalistic acoustic parameter set (gemaps) for voice research and affective computing. IEEE Transactions on Affective Computing, 7(2):190–202, 2016.
[39] Sercan Ö Arik and Tomas Pfister. Tabnet: Attentive interpretable tabular learning. In Proceedings of the AAAI conference on artificial intelligence, volume 35, pages 6679–6687, 2021.
[40] Sheng-Ya Lin, Ho-Ling Chang, Thiri Wai, Li-Chen Fu, and Yu-Ling Chang. Contrast-enhanced automatic cognitive impairment detection system with pause encoder. In 2022 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pages 2796–2801, 2022.
[41] Liang Wang, Nan Yang, Xiaolong Huang, Linjun Yang, Rangan Majumder, and Furu Wei. Multilingual e5 text embeddings: A technical report, 2024.
[42] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436–444, 2015.
[43] Dan Hendrycks and Kevin Gimpel. Gaussian error linear units (gelus). arXiv preprint arXiv:1606.08415, 2016.
[44] Claude E Shannon. A mathematical theory of communication. The Bell System Technical Journal, 27(3):379–423, 1948.
[45] Sebastian Ruder. An overview of gradient descent optimization algorithms. arXiv preprint arXiv:1609.04747, 2016.
[46] Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
[47] Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. In International Conference on Learning Representations, 2019.
[48] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997.
[49] Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, and Furu Wei. Text embeddings by weakly-supervised contrastive pre-training. arXiv preprint arXiv:2212.03533, 2022.
[50] Phong Le-Khac, Gerard Healy, and Alan F Smeaton. Contrastive representation learning: A framework and review. IEEE Access, 8:193907–193934, 2020.
[51] Silke Matura, Kathrin Muth, Jörg Magerkurth, Henrik Walter, Johannes Klein, Corinna Haenschel, and Johannes Pantel. Neural correlates of autobiographical memory in amnestic mild cognitive impairment. Psychiatry Research: Neuroimaging, 201(2):159–167, 2012.
[52] Emilie Tramoni, Olivier Felician, Lea Koric, Marc Balzamo, Sven Joubert, and Muriel Ceccaldi. Alteration of autobiographical memory in amnestic mild cognitive impairment. Cortex, 48(10):1310–1319, 2012.
[53] Juan José G Meilán, Francisco Martínez-Sánchez, Juan Carro, Dolores E López, Lymarie Millian-Morell, and José M Arana. Speech in alzheimer’s disease: Can temporal and acoustic parameters discriminate dementia? Dementia and Geriatric Cognitive Disorders, 37(5-6):327–334, 2014.
[54] Athanasios Tsanas, Max A. Little, Patrick E. McSharry, and Lorraine O. Ramig. Nonlinear speech analysis algorithms mapped to a standard metric achieve clinically useful quantification of average parkinson’s disease symptom severity. Journal of the Royal Society Interface, 8(59):842–855, 2011.
[55] Kathleen C. Fraser, Jed A. Meltzer, Naida L. Graham, Carol Leonard, Graeme Hirst, Sandra E. Black, and Elizabeth Rochon. Automated classification of primary progressive aphasia subtypes from narrative speech transcripts. Cortex, 55:43–60, 2014.
[56] Daniel M. Low, Kate H. Bentley, and Satrajit S. Ghosh. Automated assessment of psychiatric disorders using speech: A systematic review. Laryngoscope Investigative Otolaryngology, 5(1):96–116, 2020.
[57] Yao-Hung Hubert Tsai, Shaojie Bai, Paul Pu Liang, J. Zico Kolter, Louis-Philippe Morency, and Ruslan Salakhutdinov. Multimodal transformer for unaligned multimodal language sequences. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 6558–6569, 2019.
[58] Jiachen Luo, Huy Phan, Lin Wang, and Joshua D. Reiss. Bimodal connection attention fusion for speech emotion recognition. arXiv preprint arXiv:2503.05858, 2025.
[59] Inci M. Baytas, Cao Xiao, Xi Zhang, Fei Wang, Anil K. Jain, and Jiayu Zhou. Patient subtyping via time-aware lstm networks. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, page 65–74. Association for Computing Machinery, 2017.
[60] Daniel Neil, Michael Pfeiffer, and Shih-Chii Liu. Phased lstm: Accelerating recurrent network training for long or event-based sequences. Advances in neural information processing systems, 29, 2016.
[61] Zafi Sherhan Syed, Muhammad Shehram Shah Syed, Margaret Lech, and Elena Pirogova. Tackling the adresso challenge 2021: The muet-rmit system for alzheimer’s dementia recognition from spontaneous speech. In Interspeech 2021, pages 3815–3819, 2021.
[62] Morteza Rohanian, Julian Hough, and Matthew Purver. Alzheimer’s dementia recognition using acoustic, lexical, disfluency and speech pause features robust to noisy inputs. arXiv preprint arXiv:2106.15684, 2021.
[63] Youxiang Zhu, Abdelrahman Obyat, Xiaohui Liang, John A Batsis, and Robert M Roth. Wavbert: Exploiting semantic and non-semantic speech using wav2vec and bert for dementia detection. Interspeech, pages 3790–3794, 2021.
[64] Yangwei Ying, Tao Yang, and Hong Zhou. Multimodal fusion for alzheimer’s disease recognition. Applied Intelligence, 53(12):16029–16040, 2023.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/99662
dc.description.abstract [zh_TW]:
隨著全球人口快速老化,失智症的盛行率隨之攀升,對整體醫療系統與公共健康資源形成沉重壓力。輕度認知障礙(MCI)是介於正常老化與失智症之間的過渡階段,也是進行早期偵測與干預的關鍵時機。但現有診斷方式如磁振造影或生物標記具備高成本與侵入性,不利於大規模應用。近年研究指出,自傳式記憶(AM)語音資料有潛力成為辨識認知衰退的非侵入性早期指標。本研究提出一套時間感知式的雙模態縱向MCI偵測架構,整合語音與文字特徵,並針對受試者在多時間點的AM語音資料進行建模。我們設計了Cross-Visit Encoder與一種新穎的序列模型架構TiGRU(Temporal-infused GRU),將每期資料與前期內容透過cross attention對齊後,計算認知差異並將時間嵌入向量一同送入TiGRU,提升模型對於不規則訪視間隔與非線性認知變化的感知能力。弱對齊後的語音與文字分別經由wav2vec2、OpenSMILE、multilingual-e5語意編碼與額外的詞彙特徵,再透過Bidirectional Cross Attention進行雙模態融合。在NTU-AM資料集上的實驗顯示,我們的模型在MCI偵測任務中可達87%和88%的F1-score,以及90%與95%的AUROC,優於傳統GRU與未建模時間的架構。消融實驗亦驗證各模組在模型效能上的貢獻。本研究證實,結合時間與雙模態特徵的縱向模型,能有效捕捉長期認知變化與非線性認知軌跡,提供具備可擴展性與臨床潛力的 MCI 偵測方法。
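The abstract describes fusing the acoustic and text streams with bidirectional cross attention plus residual connections. As a minimal numpy sketch of that general mechanism (single-head, no learned projections; all shapes and names are illustrative assumptions, not the thesis's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q_seq, kv_seq):
    # Scaled dot-product attention: queries come from one modality,
    # keys/values from the other, so each query position aggregates
    # information from the other stream.
    d = q_seq.shape[-1]
    scores = q_seq @ kv_seq.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ kv_seq

# Hypothetical shapes: 5 text tokens and 9 acoustic frames, 32-dim each.
text = rng.standard_normal((5, 32))
audio = rng.standard_normal((9, 32))

# Bidirectional: each modality attends to the other; the residual
# addition keeps the original stream alongside the fused features.
text_fused = text + cross_attention(text, audio)    # text attends to audio
audio_fused = audio + cross_attention(audio, text)  # audio attends to text
```

Because the two streams have different lengths (weak alignment), each direction produces an output matching its own query length, which is what lets the fusion work without frame-level synchronization.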
dc.description.abstract [en]:
As populations around the world grow older, dementia has become increasingly common, placing mounting pressure on public health and medical infrastructure. Mild Cognitive Impairment (MCI), an intermediate stage between normal aging and dementia, represents a critical window for early detection and intervention. However, conventional diagnostic tools such as MRI and biomarkers are costly and invasive, limiting their scalability to large populations. Recent studies suggest that autobiographical memory (AM) speech may serve as a non-invasive early indicator of cognitive decline.
In this study, we propose a temporal-aware, dual-modal longitudinal framework for MCI detection that integrates acoustic and linguistic features from participants' AM speech collected across multiple visits. To address the challenges of unstructured speech, modality misalignment, and temporal modeling, we design a Cross-Visit Encoder and a novel sequential model called TiGRU (Temporal-infused GRU). This architecture aligns the data collected at the current and previous visits via cross attention, captures cognitive shifts, and incorporates time-interval embeddings into the TiGRU to enhance sensitivity to irregular visit spacing and nonlinear cognitive changes. Weakly aligned acoustic and text inputs from the speech are processed through wav2vec2, OpenSMILE, multilingual-e5, and lexical feature extractors, and then fused via Bidirectional Cross Attention with residual connections for robust multimodal integration. Finally, the model predicts MCI status at the last visit by leveraging information across all visits, enabling early detection and longitudinal tracking of cognitive decline.
Experiments on the NTU-AM dataset demonstrate that our model achieves F1-scores of 87% and 88%, and AUROCs of 90% and 95%, on the recall and probing data, outperforming traditional GRU-based and temporal-agnostic baselines. Ablation experiments further validate how each proposed component contributes to overall performance. Our results highlight the potential of combining temporal modeling and multimodal learning to capture long-term cognitive shifts and nonlinear cognitive trajectories, offering a scalable and clinically promising approach to early MCI detection.
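The abstract's key sequential idea is a GRU-style update whose gates also see an embedding of the irregular gap between visits. The following numpy sketch illustrates that kind of time-aware gating; the weight names, dimensions, and the sinusoidal gap embedding are all hypothetical choices for illustration, not the thesis's actual TiGRU:

```python
import numpy as np

rng = np.random.default_rng(1)
H, D, T = 16, 8, 4  # hidden size, input size, time-embedding size (hypothetical)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def time_embedding(delta_days, dim=T):
    # Sinusoidal embedding of the (possibly irregular) gap between visits,
    # in the spirit of transformer positional encodings.
    i = np.arange(dim // 2)
    freq = 1.0 / (10000.0 ** (2 * i / dim))
    ang = delta_days * freq
    return np.concatenate([np.sin(ang), np.cos(ang)])

# Randomly initialised weights for one illustrative GRU-style cell whose
# gates also condition on the time-gap embedding.
Wz, Wr, Wh = (rng.standard_normal((H, D + H + T)) * 0.1 for _ in range(3))

def tigru_step(h, x, delta_days):
    t = time_embedding(delta_days)
    xh = np.concatenate([x, h, t])
    z = sigmoid(Wz @ xh)          # update gate, aware of the visit gap
    r = sigmoid(Wr @ xh)          # reset gate
    h_tilde = np.tanh(Wh @ np.concatenate([x, r * h, t]))
    return (1 - z) * h + z * h_tilde

# Three visits with irregular spacing (gaps in days since the prior visit).
h = np.zeros(H)
for x, gap in zip(rng.standard_normal((3, D)), [0.0, 180.0, 365.0]):
    h = tigru_step(h, x, gap)
```

Feeding the gap embedding into the update gate is what lets the cell weight old state differently after a six-month gap than after a one-year gap, which a plain GRU cannot do.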
dc.description.tableofcontents:
誌謝 (Acknowledgements) i
中文摘要 (Chinese Abstract) iii
Abstract iv
Contents vi
List of Figures x
List of Tables xi
Chapter 1 Introduction 1
1.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Research Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3.1 Acoustics–Text Temporal Alignment . . . . . . . . . . . . . . . . . 4
1.3.2 Diagnostic Challenges with Unstructured Speech . . . . . . . . . . 5
1.3.3 Complexity of Temporal Modeling . . . . . . . . . . . . . . . . . . 6
1.4 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4.1 Speech-based Cognitive Detection Methods . . . . . . . . . . . . . 6
1.4.2 Longitudinal Analysis . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.5 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.6 Thesis Organization . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Chapter 2 Preliminaries 13
2.1 Deep Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.1.1 Neural Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.1.2 Activation Function . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.1.3 Loss Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.1.4 Optimizer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.1.5 Gated Recurrent Unit (GRU) . . . . . . . . . . . . . . . . . . . . . 19
2.2 Neural Attention Mechanisms . . . . . . . . . . . . . . . . . . . . . 20
2.2.1 Self-Attention Mechanism . . . . . . . . . . . . . . . . . . . . . . 20
2.2.2 Cross-Attention Mechanism . . . . . . . . . . . . . . . . . . . . . 20
2.3 Pre-training and Fine-tuning Paradigm . . . . . . . . . . . . . . . . . 22
2.3.1 Pretrained Speech Encoder: Wav2Vec 2.0 . . . . . . . . . . . . . . 22
2.3.2 Pretrained Text Encoder: Multilingual E5-small . . . . . . . . . . . 23
2.4 Contrastive Learning . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.4.1 Instance-level Contrastive Learning: SimCLR . . . . . . . . . . . . 24
2.4.2 Supervised Contrastive Learning: SupCon Loss . . . . . . . . . . . 25
Chapter 3 Methodology 26
3.1 System Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.2 Data Collection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.3 Problem Setting and Formulation . . . . . . . . . . . . . . . . . . . 29
3.3.1 Longitudinal Problem Setting . . . . . . . . . . . . . . . . . . . . . 29
3.3.2 Cross-sectional Problem Setting . . . . . . . . . . . . . . . . . . . 30
3.4 Data Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.4.1 Linguistic Feature Extraction . . . . . . . . . . . . . . . . . . . . . 31
3.4.2 Acoustic Feature Extraction . . . . . . . . . . . . . . . . . . . . . . 33
3.5 Dual-Modal Temporal-Aware Longitudinal Model . . . . . . . . . . 34
3.5.1 Linguistic Encoder . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.5.2 Acoustic Encoder . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.5.3 Temporal Encoder . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.5.4 Bidirectional Cross-Attention Fusion Layer . . . . . . . . . . . . . 38
3.5.5 Cross-Visit Encoder . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.5.6 Temporal-infused GRU (TiGRU) . . . . . . . . . . . . . . . . . . . 40
3.5.7 Classifier and Overall Loss . . . . . . . . . . . . . . . . . . . . . . 43
Chapter 4 Experiments 45
4.1 Experiment Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.1.1 Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.1.2 Implementation Details . . . . . . . . . . . . . . . . . . . . . . . . 48
4.2 Evaluation Protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.2.1 Evaluation Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.2.2 Baselines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.3 NTU-AM-LG Dataset . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.3.1 Overall Performance . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.3.2 Ablation Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.3.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.4 NTU-AM-CS Dataset . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.4.1 Ablation Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.4.2 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.5 Additional Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.6 Hyper-parameter Configuration . . . . . . . . . . . . . . . . . . . . 62
Chapter 5 Conclusion 64
References 67
dc.language.iso: en
dc.subject: 雙模態學習 [zh_TW]
dc.subject: 輕度認知障礙 [zh_TW]
dc.subject: 縱向研究 [zh_TW]
dc.subject: 不規則時間序列建模 [zh_TW]
dc.subject: 非結構化自發性語音 [zh_TW]
dc.subject: 認知分類任務 [zh_TW]
dc.subject: 快篩系統 [zh_TW]
dc.subject: Dual-modal learning [en]
dc.subject: Screening system [en]
dc.subject: Cognitive classification task [en]
dc.subject: Unstructured spontaneous speech [en]
dc.subject: Irregular time series modeling [en]
dc.subject: Longitudinal analysis [en]
dc.subject: Mild cognitive impairment [en]
dc.title: 基於TiGRU的雙模態縱向研究:自傳式記憶資料於輕度認知障礙之偵測應用 [zh_TW]
dc.title: TiGRU: A Dual-Modal Longitudinal Model for MCI Detection Using Autobiographical Memory [en]
dc.type: Thesis
dc.date.schoolyear: 113-2
dc.description.degree: 碩士 (Master)
dc.contributor.oralexamcommittee: 邱銘章; 張玉玲; 李宏毅; 林澤 [zh_TW]
dc.contributor.oralexamcommittee: Ming-Jang Chiu; Yu-Ling Chang; Hung-Yi Lee; Che Lin [en]
dc.subject.keyword: 雙模態學習, 輕度認知障礙, 縱向研究, 不規則時間序列建模, 非結構化自發性語音, 認知分類任務, 快篩系統 [zh_TW]
dc.subject.keyword: Dual-modal learning, Mild cognitive impairment, Longitudinal analysis, Irregular time series modeling, Unstructured spontaneous speech, Cognitive classification task, Screening system [en]
dc.relation.page: 76
dc.identifier.doi: 10.6342/NTU202500828
dc.rights.note: 未授權 (Not authorized for public access)
dc.date.accepted: 2025-08-11
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science)
dc.contributor.author-dept: 資訊工程學系 (Department of Computer Science and Information Engineering)
dc.date.embargo-lift: N/A
Appears in collections: 資訊工程學系 (Department of Computer Science and Information Engineering)

Files in this item:
File: ntu-113-2.pdf (4.79 MB, Adobe PDF) — 未授權公開取用 (not authorized for public access)


All items in this system are protected by copyright, with all rights reserved, unless otherwise indicated in their copyright terms.
