Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94751

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 黃漢邦 | zh_TW |
| dc.contributor.advisor | Han-Pang Huang | en |
| dc.contributor.author | 王易騰 | zh_TW |
| dc.contributor.author | Yi-Teng Wang | en |
| dc.date.accessioned | 2024-08-16T17:59:19Z | - |
| dc.date.available | 2024-08-17 | - |
| dc.date.copyright | 2024-08-16 | - |
| dc.date.issued | 2024 | - |
| dc.date.submitted | 2024-08-06 | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94751 | - |
| dc.description.abstract | 隨著全球人口的高齡化,失智症,這種因異常腦部變化導致並好發於長者的症候群,已成為社會中的重大議題。輕度認知障礙作為介於正常老化與失智症之間的階段,及早檢測可以提供即時的介入治療,從而提高患者的生活品質並可能防止其進一步惡化。為此,我們提出了一種基於情緒感知的認知功能障礙檢測方法。為了實現這一方法,我們建立了一個人機互動框架,包括人臉辨識系統、多語言對話系統、情緒辨識系統以及我們實驗室之前開發的運動輔助系統。我們提出的認知功能障礙檢測方法利用從對話和運動人機互動中收集的數據來分類三種認知能力階段:正常、輕度認知障礙和高度認知障礙。
為了評估此方法的可行性和有效性,我們在桃園龜山的大崗、嶺頂與陸光社區招募了七十名年齡介於58至97歲之間的受試者。在實驗中,受試者與機器人進行論文中提出的對話和運動互動,並利用蒙特利爾認知評估-台灣版 (MoCA-T)來評估受試者的認知功能。我們將根據MoCA-T結果分類出的認知能力階段作為基準真相 (ground truth),並使用支援向量機 (Support Vector Machine)作為我們的分類模型。實驗結果顯示,我們的方法在分類認知能力階段方面達到了92.3%的準確率。 | zh_TW |
| dc.description.abstract | With the global population aging, dementia, a syndrome caused by abnormal brain changes and prevalent among older people, has become a significant issue in society. Mild cognitive impairment (MCI) is the stage between normal aging and dementia. Early detection of cognitive impairment allows for timely intervention and treatment, improving the quality of life for individuals and potentially preventing its progression. Therefore, we propose a cognitive function impairment detection method based on emotion perception. To implement this method, we establish a human-robot interaction framework consisting of a face recognition system, a multilingual dialogue system, an emotion recognition system, and an exercise assistance system previously developed by our laboratory. Our proposed cognitive function impairment detection method utilizes the data collected from the dialogue and exercise human-robot interactions to classify three cognitive function levels: normal, mild cognitive impairment, and high cognitive impairment.
To evaluate the feasibility and efficacy of our method, we recruited seventy participants aged 58 to 97 in the Dagang, Lingding, and Luguang communities of Guishan, Taoyuan. In the experiment, participants engaged in both dialogue and exercise interactions with the robot. The Taiwan version of the Montreal Cognitive Assessment (MoCA-T) was used to evaluate the cognitive function of the participants, and ground-truth cognitive function levels were categorized based on the MoCA-T results. We employed a Support Vector Machine (SVM) as our classification model (a minimal illustrative sketch of this step follows the metadata record below). As a result, our method achieved 92.3% accuracy in classifying cognitive function levels. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-08-16T17:59:19Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2024-08-16T17:59:19Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Certification by the Oral Examination Committee (Chinese) i
Certification by the Oral Examination Committee (English) iii
Acknowledgements v
Abstract (Chinese) vii
Abstract ix
List of Tables xv
List of Figures xvii
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Contributions 3
1.3 Organization of Thesis 4
Chapter 2 Related Works 7
2.1 Dialogue System 7
2.1.1 Task-Oriented Dialogue 7
2.1.2 Open-Domain Dialogue 10
2.2 Emotion Recognition 10
2.2.1 Emotion Representation 11
2.2.2 Emotion Recognition in Different Modalities 13
2.3 Cognitive Function Impairment Detection 25
2.3.1 Machine Learning Applications on Detecting Cognitive Function Impairment 26
2.4 Human-Robot Interaction 27
2.4.1 Social Cues and Social Signals 27
2.4.2 Human-Robot Proxemics 29
Chapter 3 Human-Robot Interaction 31
3.1 Overview of the Framework 31
3.2 Face Recognition System 32
3.2.1 Face Detection 32
3.2.2 Face Alignment 33
3.2.3 Facial Feature Extraction and Facial Feature Matching 35
3.3 Multilingual Dialogue System 37
3.3.1 Speech Recognition 37
3.3.2 Natural Language Understanding 41
3.3.3 Natural Language Generation 49
3.3.4 Speech Synthesis 51
3.4 Emotion Recognition System 53
3.4.1 Emotion Recognition with Facial and Speech Modalities 53
3.4.2 Emotion Recognition with Textual Modality 59
3.4.3 Hidden Markov Model for Engagement 63
3.4.4 Robot Emotional Expression Module 63
3.5 Exercise Assistance System 68
3.5.1 Posture Feature Extraction 70
3.5.2 Characteristic Angles 75
3.5.3 Method for Grading Dynamic Posture 76
3.6 Summary 80
Chapter 4 Cognitive Function Impairment Detection 81
4.1 Cognitive Assessment 81
4.2 Human-Robot Interaction 83
4.3 Cognitive Level Classification 89
Chapter 5 Experiments 93
5.1 Hardware Configuration 93
5.1.1 Mobi Robot 93
5.2 Experiment Scenarios and Results 100
5.2.1 Emotion Recognition System 100
5.2.2 Multilingual Dialogue System 106
5.2.3 Cognitive Function Impairment Detection 129
5.3 Discussion 139
Chapter 6 Conclusions and Future Work 141
6.1 Conclusions 141
6.2 Future Work 142
References 145 | - |
| dc.language.iso | en | - |
| dc.subject | 人臉辨識 | zh_TW |
| dc.subject | 人機互動 | zh_TW |
| dc.subject | 失智症 | zh_TW |
| dc.subject | 輕度認知障礙 | zh_TW |
| dc.subject | 情緒感知 | zh_TW |
| dc.subject | 情緒辨識 | zh_TW |
| dc.subject | Human-Robot Interaction | en |
| dc.subject | Dementia | en |
| dc.subject | Mild Cognitive Impairment | en |
| dc.subject | Face Recognition | en |
| dc.subject | Emotion Recognition | en |
| dc.subject | Emotion Perception | en |
| dc.title | 基於情緒感知的長者認知功能障礙偵測 | zh_TW |
| dc.title | Emotion Perception-based Identification of Cognitive Function Impairment for Older People | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 112-2 | - |
| dc.description.degree | 碩士 | - |
| dc.contributor.oralexamcommittee | 李祖聖;劉益宏;林峻永 | zh_TW |
| dc.contributor.oralexamcommittee | Tzuu-Hseng S. Li; Yi-Hung Liu; Chun-Yeon Lin | en |
| dc.subject.keyword | 人機互動,人臉辨識,情緒辨識,情緒感知,輕度認知障礙,失智症 | zh_TW |
| dc.subject.keyword | Human-Robot Interaction, Emotion Recognition, Face Recognition, Emotion Perception, Mild Cognitive Impairment, Dementia | en |
| dc.relation.page | 156 | - |
| dc.identifier.doi | 10.6342/NTU202403458 | - |
| dc.rights.note | 未授權 | - |
| dc.date.accepted | 2024-08-09 | - |
| dc.contributor.author-college | 工學院 | - |
| dc.contributor.author-dept | 機械工程學系 | - |
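The abstract above describes the final classification step: an SVM trained on features gathered from dialogue and exercise human-robot interactions, with ground-truth labels derived from MoCA-T scores, distinguishing normal, MCI, and high cognitive impairment. The sketch below is only an illustration of that step under stated assumptions; the feature names and data are hypothetical placeholders, since this record does not specify the thesis's actual features or preprocessing.

```python
# Hedged sketch of the three-way cognitive-level classification described
# in the abstract. Features and labels here are synthetic placeholders;
# the thesis's real per-participant features are not given in this record.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical stand-ins for per-participant interaction features
# (e.g., dialogue emotion scores, exercise posture grades).
X = rng.normal(size=(70, 6))      # 70 participants, 6 placeholder features
y = rng.integers(0, 3, size=70)   # 0 = normal, 1 = MCI, 2 = high impairment

# Standardize features, then fit an RBF-kernel SVM; scikit-learn handles
# the multi-class case internally (one-vs-one).
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))

# 5-fold cross-validated accuracy. On random placeholders this hovers near
# chance (~33%); the thesis reports 92.3% on its real features and labels.
scores = cross_val_score(clf, X, y, cv=5)
print(f"mean CV accuracy: {scores.mean():.3f}")
```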
| Appears in Collections: | 機械工程學系 |
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-112-2.pdf (Restricted Access) | 8.77 MB | Adobe PDF |
All items in the system are protected by copyright, with all rights reserved, unless otherwise indicated.
