Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/74637
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 鄭士康(Shyh-Kang Jeng) | |
dc.contributor.author | De-Wei Ye | en |
dc.contributor.author | 葉德緯 | zh_TW |
dc.date.accessioned | 2021-06-17T08:47:08Z | - |
dc.date.available | 2021-02-22 | |
dc.date.copyright | 2021-02-22 | |
dc.date.issued | 2021 | |
dc.date.submitted | 2021-02-02 | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/74637 | - |
dc.description.abstract | 本論文以深度卷積類神經網路實作可用於台灣年長者的臉部表情辨識模型,並探討年齡效應對於卷積類神經網路模型的表情認知影響。該模型結合多個臉部表情圖片的資料庫訓練,並搭配適當的數據平衡,使模型具有跨情境的穩健預測準確度。另藉由遷移學習領域的微調方法,能使模型在少量額外資料的幫助下,使台灣年長者的臉部表情辨識準確率進一步提升。實驗結果顯示,卷積類神經網路模型對台灣人的臉部表情辨識整體準確度優於人類與使用人工特徵的傳統電腦方法,其對不同年齡族群的辨識結果差異也較人類與傳統方法小。如同人類以及傳統電腦方法,卷積類神經網路模型從老人表情中接受到的情緒強度仍然比年輕人弱。而藉由應用可解釋人工智慧的方法,我們也視覺化卷積類神經網路模型分類的依據,並發現臉部肌肉的弱化與皺紋影響卷積類神經網路模型。整體而言,年齡效應對卷積類神經網路模型的影響與人類有相似之處,但實際上仍比較類似傳統人工特徵方法。 | zh_TW |
dc.description.abstract | This thesis implements a robust cross-dataset facial expression recognition system for Taiwanese elders based on a deep convolutional neural network (CNN), and investigates how aging affects expression recognition by the CNN model. The model is trained on several combined datasets with data balancing so that it stays robust on unseen datasets, and its recognition performance on Taiwanese elders is further improved by fine-tuning on a small amount of extra data. The CNN model outperforms both human raters and a handcrafted-feature method on Taiwanese faces, and its accuracy gap between old and young faces is smaller than that of human raters and the handcrafted-feature method. Still, like human raters and the handcrafted-feature method, the CNN model perceives weaker emotional intensity in old faces. With the aid of an explainable artificial intelligence (XAI) method, we visualize the facial parts the CNN model finds discriminative; the visualization suggests that weakened facial muscles and wrinkles affect expression recognition for the CNN model. Overall, when perceiving elderly facial expressions, the CNN model resembles the handcrafted-feature method more than it resembles human raters. (A minimal code sketch of the fine-tuning and data-balancing steps follows this metadata record.) | en |
dc.description.provenance | Made available in DSpace on 2021-06-17T08:47:08Z (GMT). No. of bitstreams: 1 U0001-1601202120575800.pdf: 4983092 bytes, checksum: 17b051f44929423954242d25bac23efe (MD5) Previous issue date: 2021 | en |
dc.description.tableofcontents | 口試委員會審定書 i 誌謝 ii 中文摘要 iii ABSTRACT iv CONTENTS v LIST OF FIGURES vii LIST OF TABLES viii Chapter 1 Introduction 1 1.1 Motivation 1 1.2 Problem Statement 2 1.3 Literature Reviews 3 1.4 Contributions 6 1.5 Chapter Outline 6 Chapter 2 Background Knowledge 7 2.1 Expressions Datasets and Basic Expressions 7 2.2 FER with CNNs 9 2.3 Supervised Learning 12 2.4 Transfer Learning and Fine-Tuning 14 2.5 Explainable Artificial Intelligence 16 Chapter 3 System Design 21 3.1 Data Preprocessing 22 3.1.1 Face Alignment 22 3.1.2 Data Augmentation 23 3.1.3 Data Normalization 23 3.2 CNN Architecture 24 Chapter 4 Datasets 27 4.1 Common Facial Expression Datasets 27 4.2 East Asian Facial Expression Database 33 Chapter 5 Experiment Setup 35 5.1 CNN for Cross-Dataset FER 35 5.1.1 Datasets and Evaluation Metrics 36 5.1.2 CNN Architectures 37 5.1.3 Input Image Size 38 5.1.4 Data Balance 38 5.2 Fine-Tuning 40 5.3 Visualizing Discriminative Parts 41 Chapter 6 Results and Discussion 42 6.1 CNN for Cross-Dataset FER 42 6.2 Aging Effects 47 6.3 Visualizing Discriminative Parts 52 6.4 Fine-Tuning 56 Chapter 7 Conclusion 59 Appendix A Code Release 60 REFERENCE 61 | |
dc.language.iso | en | |
dc.title | 以台灣年長者為對象之基於深度卷積類神經網路的自動臉部表情辨識 | zh_TW |
dc.title | Automatic Facial Expression Recognition for Taiwanese Elders with Deep Convolutional Neural Network | en |
dc.type | Thesis | |
dc.date.schoolyear | 109-1 | |
dc.description.degree | 碩士 (Master's) | |
dc.contributor.oralexamcommittee | 李宏毅(Hung-Yi Lee),吳恩賜(Joshua Goh) | |
dc.subject.keyword | 臉部表情辨識,年長者看護,深度學習,遷移學習,可解釋人工智慧 | zh_TW |
dc.subject.keyword | Facial expression recognition, Elderly care, Deep learning, Transfer learning, Explainable artificial intelligence | en |
dc.relation.page | 70 | |
dc.identifier.doi | 10.6342/NTU202100072 | |
dc.rights.note | 有償授權 (paid authorization) | |
dc.date.accepted | 2021-02-03 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 電信工程學研究所 | zh_TW |
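
Below is a minimal PyTorch sketch of the training recipe the abstract describes: fine-tuning a pretrained CNN on a small amount of elder-face data while balancing the expression classes during sampling. This is an illustrative reconstruction under stated assumptions, not the thesis's released code: the VGG-16 backbone, the hypothetical `elder_faces/train` folder layout, and all hyperparameters are assumptions made for the example.

```python
# Hedged sketch: fine-tune a pretrained CNN for 7-class facial expression
# recognition with class-balanced sampling. Backbone, paths, and
# hyperparameters are illustrative assumptions, not the thesis's exact setup.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, WeightedRandomSampler
from torchvision import datasets, models, transforms

NUM_CLASSES = 7  # e.g., anger, disgust, fear, happiness, sadness, surprise, neutral

# Assumed layout: aligned face crops under elder_faces/train/<class_name>/*.png
train_set = datasets.ImageFolder(
    "elder_faces/train",
    transform=transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.RandomHorizontalFlip(),           # light augmentation
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406],  # ImageNet statistics
                             [0.229, 0.224, 0.225]),
    ]),
)

# Data balancing: draw each sample with probability inversely
# proportional to the size of its expression class.
class_counts = torch.bincount(torch.tensor(train_set.targets))
sample_weights = (1.0 / class_counts.float())[train_set.targets]
loader = DataLoader(
    train_set, batch_size=32,
    sampler=WeightedRandomSampler(sample_weights, num_samples=len(sample_weights)),
)

# Fine-tuning: freeze the pretrained convolutional features and retrain
# only the classifier head on the small extra dataset.
model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
for param in model.features.parameters():
    param.requires_grad = False                      # keep learned features
model.classifier[6] = nn.Linear(4096, NUM_CLASSES)   # new expression head

optimizer = torch.optim.Adam(
    [p for p in model.parameters() if p.requires_grad], lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):  # a few epochs are typical when the extra data is small
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```

Under the same assumptions, a common variant is to also unfreeze the last convolutional block with a smaller learning rate when slightly more extra data is available.
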
Appears in Collections: | 電信工程學研究所
Files in This Item:
File | Size | Format | |
---|---|---|---|
U0001-1601202120575800.pdf (currently not authorized for public access) | 4.87 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.