NTU Theses and Dissertations Repository
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/7539
Title: Deep Temporal-Contrastive Network for Facial Expression Recognition
Authors: Zi-Jun Li (黎子駿)
Advisor: Li-Chen Fu (傅立成)
Keyword: Facial Expression Recognition, Convolution Neural Network, Contrastive Representation
Publication Year: 2018
Degree: Master
Abstract: Facial expressions reflect human psychological activity, so expression recognition is a key element of human-machine interaction. Facial expression recognition is a challenging task even for humans, mainly because each individual expresses emotions with a different style and intensity. To extract what facial expressions have in common across different individuals, the influence of individual personality on recognition must be minimized as much as possible.
In this thesis, we construct a video-based facial expression recognition system using a deep temporal-contrastive network (DTCN) that exploits temporal features to reduce the personality effect. Appearance and geometry features are extracted by a CNN and a DNN from face images and facial-landmark coordinates, respectively. An additional loss function is introduced so that the network extracts similar features from adjacent frames, whose expression category and intensity are similar. The two most representative frames of a video sequence are then selected by comparing the distances between frames in the high-dimensional feature space, and expressions are classified from the contrastive representation between those two key frames.
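The two ideas in this paragraph can be sketched as follows. This is an illustrative sketch under our own assumptions, not the thesis implementation: a temporal-consistency term that pulls features of adjacent frames together, and key-frame selection that picks the pair of frames whose features lie farthest apart, using their difference as the contrastive representation. All function names are hypothetical; real features would come from the CNN/DNN streams.

```python
import numpy as np

def temporal_consistency_loss(features: np.ndarray) -> float:
    """Mean squared distance between consecutive-frame features.
    features: (T, D) array of per-frame feature vectors."""
    diffs = features[1:] - features[:-1]
    return float(np.mean(np.sum(diffs ** 2, axis=1)))

def pick_key_frames(features: np.ndarray):
    """Return indices (i, j) of the two frames with the largest pairwise
    Euclidean distance, plus their difference vector (the contrastive
    representation)."""
    # Pairwise squared distances via the ||a - b||^2 expansion.
    sq = np.sum(features ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * features @ features.T
    i, j = np.unravel_index(np.argmax(d2), d2.shape)
    return (int(i), int(j)), features[j] - features[i]

# Toy sequence: features drift from a neutral frame toward an apex frame.
feats = np.array([[0.0, 0.0], [0.1, 0.0], [0.9, 0.2], [1.0, 1.0]])
(i, j), rep = pick_key_frames(feats)
print(i, j)   # -> 0 3 (neutral vs. apex frame)
print(temporal_consistency_loss(feats))
```

A smoother feature trajectory yields a lower consistency loss, which is exactly what the extra loss term rewards during training.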
We use joint fine-tuning to combine the two models, which take face images and facial landmarks as input, respectively. The two models are complementary, and the combination improves recognition accuracy.
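As a minimal sketch of combining two streams, the snippet below fuses the class scores of the appearance and geometry branches before the softmax. This shows only the simple score-level idea, not the thesis's joint fine-tuning procedure; the logits are invented for illustration.

```python
import numpy as np

def softmax(z: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over a 1-D score vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Hypothetical per-class scores from each stream (3 expression classes).
appearance_logits = np.array([2.0, 0.5, -1.0])  # CNN on face images
geometry_logits   = np.array([1.5, 1.0, -0.5])  # DNN on landmark coordinates

# Sum the two streams' scores, then normalize into class probabilities.
fused = softmax(appearance_logits + geometry_logits)
print(fused.argmax())  # -> 0 (both streams agree on class 0)
```

In the thesis the two branches are instead fine-tuned jointly, so the fusion weights are learned rather than fixed as an equal sum.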
We conducted experiments on two of the most widely used facial expression recognition databases, CK+ and Oulu-CASIA. The results show that the proposed method effectively extracts key frames and outperforms state-of-the-art methods in recognition accuracy.
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/7539
DOI: 10.6342/NTU201802214
Fulltext Rights: Authorized (open access worldwide)
Embargo lift date: 2023-07-31
Appears in Collections: Department of Electrical Engineering

Files in This Item:
ntu-107-1.pdf (4.12 MB, Adobe PDF)


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
