Learning Based Age Estimation Using Joint Loss and Facial Landmarks

Min-Chen Hsu; 許銘宸

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/57672

Title:	Learning Based Age Estimation Using Joint Loss and Facial Landmarks (基於結合損失函數及臉部特徵之學習型年齡估測)
Author:	Min-Chen Hsu (許銘宸)
Advisor:	Jian-Jung Ding (丁建均)
Keywords:	age estimation, deep learning, face detection, machine learning
Year of publication:	2020
Degree:	Master's
Abstract:	In the 21st century, technologies such as the Internet of Things (IoT) and big data are developing rapidly, and age estimation is a practical technology that can be integrated with them, for example in home networks or police crime-monitoring systems. Its range of applications keeps widening: it can also support underage warnings for tobacco and alcohol sales in convenience stores and alerts for unlicensed driving. Combined with IoT, age estimation has practical uses in security patrols, home care, and educational and entertainment venues.

The method proposed in this thesis has three stages. The first stage is data preprocessing: we use dual detectors to detect faces in each photo, crop the detected faces, align the cropped faces, and finally fine-tune their brightness and contrast. The second stage is our deep convolutional layers, which comprise a feature extraction model and a dual classification model. The preprocessed photos are first fed into the feature extraction model, a Residual Attention Model based on the attention mechanism. The feature extractor outputs a 1024-dimensional embedding, which serves as the input to the classification model. The dual classification model first assigns each photo to one of 10 classes delimited by the ages 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, and 100. The result is then passed to that class's own second classification model, which predicts the difference between the photo's age and the class average. Finally, the predicted age is computed from the class average and the predicted difference.

We use IMDB as the training set and WIKI as the validation set, and test on the FG-Net and LAP datasets. The experimental results show that the proposed architecture effectively reduces the estimation error.
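The final step of the abstract, combining the coarse class with the predicted difference, can be sketched as follows. This is only an illustrative sketch, not the thesis code: the function names `bin_midpoints` and `decode_age` are hypothetical, the class boundaries are taken from the 0 to 100 decade split described above, and each bin's midpoint is used as a stand-in for the class average, which the abstract does not define precisely.

```python
from typing import List, Optional, Sequence

# Assumed edges of the 10 coarse classes: 0-10, 10-20, ..., 90-100.
BIN_EDGES = list(range(0, 101, 10))


def bin_midpoints(edges: Sequence[float] = BIN_EDGES) -> List[float]:
    """Midpoint of each bin, used as a stand-in for the per-class average age."""
    return [(lo + hi) / 2 for lo, hi in zip(edges[:-1], edges[1:])]


def decode_age(coarse_scores: Sequence[float],
               offsets: Sequence[float],
               bin_means: Optional[Sequence[float]] = None) -> float:
    """Combine the two classification stages into a single age estimate.

    coarse_scores: one score per coarse class from the first classifier.
    offsets: the difference predicted by each class's second-stage model
             (only the entry of the winning class is used).
    bin_means: per-class average age; defaults to the bin midpoints.
    """
    means = list(bin_means) if bin_means is not None else bin_midpoints()
    if not (len(coarse_scores) == len(offsets) == len(means)):
        raise ValueError("expected one score, offset, and mean per coarse class")
    # Stage 1: pick the most likely coarse class.
    k = max(range(len(coarse_scores)), key=lambda i: coarse_scores[i])
    # Stage 2: add that class's predicted difference to its average age.
    return means[k] + offsets[k]


# Example: the first classifier favors the 20-30 class and the second-stage
# model for that class predicts a difference of -1.5 years.
scores = [0.0] * 10
scores[2] = 0.9
diffs = [0.0] * 10
diffs[2] = -1.5
print(decode_age(scores, diffs))  # 25.0 + (-1.5) = 23.5
```

If the class average were instead computed from the training ages in each class, it could be passed through the `bin_means` argument without changing the rest of the sketch.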
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/57672
DOI:	10.6342/NTU202001609
Full-text license:	Paid license
Appears in department:	Graduate Institute of Communication Engineering

Files in this document:

File	Size	Format
U0001-1707202017142500.pdf (not authorized for public access)	2.82 MB	Adobe PDF


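Unless their copyright terms are specifically stated otherwise, documents in this system are protected by copyright, with all rights reserved.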
