利用深層卷積神經網絡之多層次特征偵測臉部關鍵點

Ya-Xu Liu; 劉亞旭

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/60076

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	洪一平
dc.contributor.author	Ya-Xu Liu	en
dc.contributor.author	劉亞旭	zh_TW
dc.date.accessioned	2021-06-16T09:54:32Z	-
dc.date.available	2022-02-08
dc.date.copyright	2017-02-08
dc.date.issued	2016
dc.date.submitted	2017-01-05
dc.identifier.citation	REFERENCE
[1] Sun, Yi, Xiaogang Wang, and Xiaoou Tang. 'Deep convolutional network cascade for facial point detection.' Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2013.
[2] Liu, Ziwei, et al. 'Deep learning face attributes in the wild.' Proceedings of the IEEE International Conference on Computer Vision. 2015.
[3] Zeiler, Matthew D., and Rob Fergus. 'Visualizing and understanding convolutional networks.' European Conference on Computer Vision. Springer International Publishing, 2014.
[4] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. 'ImageNet classification with deep convolutional neural networks.' Advances in Neural Information Processing Systems. 2012.
[5] Zhang, Zhanpeng, et al. 'Facial landmark detection by deep multi-task learning.' European Conference on Computer Vision. Springer International Publishing, 2014.
[6] Xu, Bing, et al. 'Empirical evaluation of rectified activations in convolutional network.' arXiv preprint arXiv:1505.00853 (2015).
[7] Burgos-Artizzu, Xavier P., Pietro Perona, and Piotr Dollar. 'Robust face landmark estimation under occlusion.' Proceedings of the IEEE International Conference on Computer Vision. 2013.
[8] Cao, Xudong, et al. 'Face alignment by explicit shape regression.' International Journal of Computer Vision 107.2 (2014): 177-190.
[9] Cootes, Tim F., et al. 'Robust and accurate shape model fitting using random forest regression voting.' European Conference on Computer Vision. Springer Berlin Heidelberg, 2012.
[10] Valstar, Michel, et al. 'Facial point detection using boosted regression and graph models.' Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on. IEEE, 2010.
[11] Yan, Junjie, et al. 'Learn to combine multiple hypotheses for accurate face alignment.' Proceedings of the IEEE International Conference on Computer Vision Workshops. 2013.
[12] Liang, Lin, et al. 'Face alignment via component-based discriminative search.' European Conference on Computer Vision. Springer Berlin Heidelberg, 2008.
[13] Matthews, Iain, and Simon Baker. 'Active appearance models revisited.' International Journal of Computer Vision 60.2 (2004): 135-164.
[14] Nair, Vinod, and Geoffrey E. Hinton. 'Rectified linear units improve restricted Boltzmann machines.' Proceedings of the 27th International Conference on Machine Learning (ICML-10). 2010.
[15] He, Kaiming, et al. 'Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification.' Proceedings of the IEEE International Conference on Computer Vision. 2015.
[16] Jin, Xiaojie, et al. 'Deep Learning with S-shaped Rectified Linear Activation Units.' arXiv preprint arXiv:1512.07030 (2015).
[17] Srivastava, Nitish, et al. 'Dropout: a simple way to prevent neural networks from overfitting.' Journal of Machine Learning Research 15.1 (2014): 1929-1958.
[18] Belhumeur, Peter N., et al. 'Localizing parts of faces using a consensus of exemplars.' IEEE Transactions on Pattern Analysis and Machine Intelligence 35.12 (2013): 2930-2940.
[19] Kostinger, Martin, et al. 'Annotated facial landmarks in the wild: A large-scale, real-world database for facial landmark localization.' Computer Vision Workshops (ICCV Workshops), 2011 IEEE International Conference on. IEEE, 2011.
[20] Zhu, Xiangxin, and Deva Ramanan. 'Face detection, pose estimation, and landmark localization in the wild.' Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012.
[21] Szegedy, Christian, et al. 'Going deeper with convolutions.' Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
[22] Girshick, Ross, et al. 'Region-based convolutional networks for accurate object detection and segmentation.' IEEE Transactions on Pattern Analysis and Machine Intelligence 38.1 (2016): 142-158.
[23] Girshick, Ross. 'Fast R-CNN.' Proceedings of the IEEE International Conference on Computer Vision. 2015.
[24] Ren, Shaoqing, et al. 'Faster R-CNN: Towards real-time object detection with region proposal networks.' Advances in Neural Information Processing Systems. 2015.
[25] Hariharan, Bharath, et al. 'Hypercolumns for object segmentation and fine-grained localization.' Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
[26] Exploit All the Layers: Fast and Accurate CNN Object Detector with Scale Dependent Pooling and Cascaded Rejection Classifiers
[27] Luxand Incorporated: Luxand face SDK, http://www.luxand.com/
[28] Yu, Xiang, et al. 'Pose-free facial landmark fitting via optimized part mixtures and cascaded deformable shape model.' Proceedings of the IEEE International Conference on Computer Vision. 2013.
[29] Xiong, Xuehan, and Fernando De la Torre. 'Supervised descent method and its applications to face alignment.' Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2013.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/60076	-
dc.description.abstract	因為日常生活中人臉照片中常常出現的大幅度頭部移動、物體遮擋和光照變化，臉部關鍵點的偵測一直是一個富有挑戰而又急需解決的問題。在此論文中，我們嘗試利用深層卷積神經網絡中多層次分佈的特征去協助我們更好地解決這個問題。通過實驗研究，兩層額外的卷積層幫助我們有效地提取出在不同層中的特征圖並將它們有效結合在一起令偵測結果得到提升。除此之外，我們還進一步研究了不同種類激活函數對於模型快速收斂的幫助。通過在不同網絡層中使用不同激活函數，我們不僅加快了模型收斂，還得到更好地實驗結果。在文章最後，我們比較我們的模型與過去知名的方法在不同數據上的表現，並發現我們的模型有足夠的能力超越它們。	zh_TW
dc.description.abstract	Facial landmark detection has received a significant amount of attention due to its applications in face alignment, recognition, verification, and identification. Yet various factors, including head pose variation, occlusion, and illumination uncertainty, make this task challenging. In this study, we exploit hierarchically distributed features that can be learned by deep convolutional neural networks (DCNNs). We modify the standard DCNN by adding two convolutional layers that extract the semantic features in each convolutional layer. We further explore various activation functions and their effect on the model's convergence efficiency. Experimental results demonstrate that the combination of PReLU and linear activation performs well in the facial landmark detection task. Our approach has also been evaluated on a benchmark dataset and compared with several state-of-the-art methods. The superior results demonstrate its effectiveness and general applicability.	en
dc.description.provenance	Made available in DSpace on 2021-06-16T09:54:32Z (GMT). No. of bitstreams: 1 ntu-105-R03944045-1.pdf: 1935500 bytes, checksum: d07a260f829b43de830011a1007cd1c4 (MD5) Previous issue date: 2016	en
dc.description.tableofcontents	CONTENTS
Oral Examination Committee Certification #
Acknowledgements i
Chinese Abstract ii
ABSTRACT iii
CONTENTS iv
LIST OF FIGURES vi
LIST OF TABLES vii
Chapter 1 Introduction 1
Chapter 2 Related Work 4
2.1 Face Landmark Detection 4
2.2 Face Landmark Detection by DCNN 4
2.3 Hierarchical Features 5
Chapter 3 Deep Convolution Neural Network Model 6
3.1 Main Architecture 7
3.2 Hierarchical Feature Component 8
3.3 Activation Functions 9
3.4 Dropout 11
Chapter 4 Experiments 13
4.1 Dataset 13
4.1.1 Multi-Task Facial Landmark (MTFL) dataset 13
4.1.2 Annotated Face in-the-Wild (AFW) dataset 13
4.1.3 Facial Keypoints Detection Dataset in Kaggle 14
4.2 Implementation Details 14
4.2.1 Preprocessing 14
4.2.2 Data Augmentation 14
4.2.3 Custom objective function (or loss function) 15
4.3 Evaluation methods 15
4.4 Evaluating different activation functions in intermediate layers and the output layer 15
4.5 Comparing different network architectures 17
4.6 Results of our model on the AFLW, AFW and Kaggle dataset 19
Chapter 5 Conclusion and Future Work 22
REFERENCE 23
dc.language.iso	en
dc.subject	深層卷積神經網絡	zh_TW
dc.subject	臉部關鍵點偵測	zh_TW
dc.subject	激活函數	zh_TW
dc.subject	多層次分佈特征	zh_TW
dc.subject	Face Landmark Detection	en
dc.subject	Hierarchically Distributed Feature	en
dc.subject	Deep Convolutional Neural Network	en
dc.subject	Activation Function	en
dc.title	利用深層卷積神經網絡之多層次特征偵測臉部關鍵點	zh_TW
dc.title	Exploit hierarchical distributed features in DCNN for Accurate Face Landmark Detection	en
dc.type	Thesis
dc.date.schoolyear	105-1
dc.description.degree	Master's
dc.contributor.oralexamcommittee	陳冠文,傅楸善,陳祝嵩,徐繼聖
dc.subject.keyword	臉部關鍵點偵測,深層卷積神經網絡,激活函數,多層次分佈特征	zh_TW
dc.subject.keyword	Deep Convolutional Neural Network, Hierarchically Distributed Feature, Activation Function, Face Landmark Detection	en
dc.relation.page	26
dc.identifier.doi	10.6342/NTU201700008
dc.rights.note	Paid license
dc.date.accepted	2017-01-06
dc.contributor.author-college	電機資訊學院	zh_TW
dc.contributor.author-dept	資訊網路與多媒體研究所	zh_TW
Appears in Collections:	資訊網路與多媒體研究所

Files in This Item:

File	Size	Format
ntu-105-1.pdf (not authorized for public access)	1.89 MB	Adobe PDF

Items in the system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.