NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/69904
Full metadata record (DC field: value [language])
dc.contributor.advisor: 許永真
dc.contributor.author: Cheng-Hao Tu [en]
dc.contributor.author: 屠政皓 [zh_TW]
dc.date.accessioned: 2021-06-17T03:33:20Z
dc.date.available: 2018-03-02
dc.date.copyright: 2018-03-02
dc.date.issued: 2018
dc.date.submitted: 2018-02-13
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/69904
dc.description.abstract: Facial action unit detection, which aims to detect facial muscle activities from face images, is an important task for enabling emotion recognition from facial movements. By coding facial muscle activities into a system of facial Action Units (AUs), facial expressions can be described precisely. However, predicting AUs from fine-grained facial appearance remains challenging, in part because appearance varies widely across subjects.

In this thesis, we address this problem by introducing an auxiliary neutral face image to produce person-specific transformations for different subjects. With the help of neutral faces, our method extracts effective features of facial muscle activities despite divergent individual appearances. We combine an additional face clustering task with the AU detection task to form multi-task network cascades, and train the cascades jointly. First, to train the face clustering networks that produce the person-specific transformations, we use identity-annotated datasets containing numerous subjects; this alleviates a common limitation of existing AU-annotated datasets, which cover only a few subjects. Second, we transform the facial features with the person-specific transformations to reduce individual differences before predicting AU labels. The proposed network cascades thus exploit not only visual but also identity information, and detect AUs more effectively through personalized appearance normalization.

Our experimental results on the BP4D dataset show that our method outperforms state-of-the-art methods. Experiments under cross-dataset and cross-group scenarios further demonstrate the robustness of our method. [en]
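Since the full text is not openly available, the following is only a rough illustration of the cascaded design the abstract describes: an identity branch derives a person-specific transformation from a subject's neutral face, and that transformation normalizes the target face's features before multi-label AU detection. It is a minimal PyTorch sketch; the module names, layer sizes, and the affine form of the transformation are assumptions made for illustration, not the thesis's actual architecture.

# Hypothetical sketch (not the thesis's actual architecture): an identity
# branch maps a neutral face to a person-specific affine transform that
# normalizes the target face's features before AU detection.
import torch
import torch.nn as nn

FEAT_DIM, NUM_AUS = 128, 12  # assumed feature size and number of AUs

class PersonalizedAUDetector(nn.Module):
    def __init__(self):
        super().__init__()
        # Shared convolutional backbone for neutral and target faces.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, FEAT_DIM),
        )
        # Identity branch: predicts a per-subject scale and shift
        # from the neutral-face features.
        self.to_scale = nn.Linear(FEAT_DIM, FEAT_DIM)
        self.to_shift = nn.Linear(FEAT_DIM, FEAT_DIM)
        # Multi-label AU head on the person-normalized features.
        self.au_head = nn.Linear(FEAT_DIM, NUM_AUS)

    def forward(self, target_img, neutral_img):
        f_target = self.backbone(target_img)
        f_neutral = self.backbone(neutral_img)
        # Person-specific normalization of the target features.
        f_norm = self.to_scale(f_neutral) * f_target + self.to_shift(f_neutral)
        return self.au_head(f_norm)  # one logit per AU

model = PersonalizedAUDetector()
logits = model(torch.randn(2, 3, 64, 64), torch.randn(2, 3, 64, 64))
probs = torch.sigmoid(logits)  # AUs are detected independently

For training, a per-AU binary cross-entropy such as torch.nn.BCEWithLogitsLoss would fit the multi-label setting; the jointly trained face clustering objective the abstract mentions would be a second head on the identity branch, which this sketch omits.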
dc.description.provenance: Made available in DSpace on 2021-06-17T03:33:20Z (GMT). No. of bitstreams: 1. ntu-107-R04922023-1.pdf: 3989656 bytes, checksum: 496d77a392caa0af3575de72fb408eeb (MD5). Previous issue date: 2018 [en]
dc.description.tableofcontents:
Chapter 1 Introduction 1
  1.1 Motivation 1
  1.2 Research Objective 6
  1.3 Thesis Organization 6
Chapter 2 Related Work 7
  2.1 Inductive Learning 7
    2.1.1 Primary Emotion Classification 8
    2.1.2 Action Unit Detection 12
    2.1.3 Multi-task Learning 13
  2.2 Transductive Learning 15
  2.3 Neutral Face Feature Extraction 17
Chapter 3 Problem Statement 19
  3.1 Problem Background 19
  3.2 Notations 20
  3.3 Problem Definition 21
  3.4 Evaluation 22
Chapter 4 Face Clustering and Action Unit Detection Networks 24
  4.1 Face Clustering 24
    4.1.1 Identity Classification 25
    4.1.2 Similarity Learning 27
  4.2 Action Unit Detection 28
Chapter 5 Multi-task Network Cascades 31
  5.1 Framework Overview 31
  5.2 Preprocessing 37
  5.3 Multi-task Cascaded Architecture 38
  5.4 Combining with Neutral Faces 44
Chapter 6 Experiments 48
  6.1 Experiment Datasets 48
  6.2 Network Structures and Hyperparameters 51
  6.3 Major Results 53
    6.3.1 Single-dataset Scenarios 54
    6.3.2 Cross-dataset Scenarios 58
  6.4 Analysis 60
Chapter 7 Conclusion 65
  7.1 Summary 65
  7.2 Future Work 66
Bibliography 67
dc.language.iso: en
dc.subject: 機器學習 [zh_TW]
dc.subject: Machine Learning [en]
dc.title: 使用多任務網路串聯於個人化臉部動作單元偵測之研究 [zh_TW]
dc.title: Personalized Facial Action Unit Detection Using Multi-task Network Cascades [en]
dc.type: Thesis
dc.date.schoolyear: 106-1
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 廖弘源, 傅立成, 楊智淵
dc.subject.keyword: 機器學習 [zh_TW]
dc.subject.keyword: Machine Learning [en]
dc.relation.page: 73
dc.identifier.doi: 10.6342/NTU201800419
dc.rights.note: 有償授權 (licensed for a fee)
dc.date.accepted: 2018-02-13
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science) [zh_TW]
dc.contributor.author-dept: 資訊工程學研究所 (Graduate Institute of Computer Science and Information Engineering) [zh_TW]
Appears in Collections: 資訊工程學系

Files in This Item:
ntu-107-1.pdf (3.9 MB, Adobe PDF): not authorized for public access


All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
