Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/15393

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 張智星(Jyh-Shing Jang) | |
| dc.contributor.author | Jun-Qing Liu | en |
| dc.contributor.author | 劉濬慶 | zh_TW |
| dc.date.accessioned | 2021-06-07T17:33:41Z | - |
| dc.date.copyright | 2020-08-21 | |
| dc.date.issued | 2020 | |
| dc.date.submitted | 2020-08-03 | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/15393 | - |
| dc.description.abstract | 隨著卷積神經網路的推演,人臉相關技術跟以前相比已有大規模的突破,在人臉偵測以及人臉辨識已經達到商用的水平。本論文主要基於Insightface的人臉偵測以及識別的框架來嘗試一些改善的方法,包含以人臉追蹤來改進人臉識別的辨識率,並使用深度殘差等變映射模型DREAM block實現正臉和側臉之間的特徵向量轉換。首先,在人臉偵測方面,我們挑選了兩種人臉偵測演算法,MTCNN和RetinaFace,在我們的資料集上評估此兩種方法的偵測效能,實驗結果顯示RetinaFace效果比較好。然後我們再進行人臉校正以及各項正規化處理後,基於四種特徵提取模型(LResNet34、50、100E-IR,MobileFaceNet)來抽取特徵向量,最後進行人臉辨識效能比較。我們資料集的場景主要為教室,總共有三種資料集,面臨多角度人臉辨識及追蹤問題,以及坐在教室後排較小人臉辨識問題,透過DREAM block在MobileFaceNet以及LResNet34E-IR模型上人臉辨識平均提升0.06% (1號資料集提升0.03%、2號資料集提升0.1%、3號資料集提升0.03%)。使用人臉追蹤輔助人臉識別使我們的人臉識別準確度平均提高2.12% (1號資料集提升1.18%、2號資料集提升3.85%、3號資料集提升1.33%)。 | zh_TW |
| dc.description.abstract | Due to the success of convolutional neural networks (CNNs), face-related technologies have achieved large-scale breakthroughs in recent years and have reached a level mature enough for commercial applications. Based on Insightface's face detection and recognition framework, this study proposes several improvements for face recognition, including face-detection-based face tracking for more robust recognition, and the DREAM block for feature conversion between frontal and profile faces. More specifically, we evaluated two face detection algorithms, MTCNN and RetinaFace, on our classroom video datasets; RetinaFace performed better in our experiments. After face detection, we perform face alignment and several normalization steps, and then send each face to one of four feature extractors: LResNet34E-IR, LResNet50E-IR, LResNet100E-IR, and MobileFaceNet. The final recognition step is based on the distance/similarity between two face embedding vectors. Our data comprise three classroom corpora, which pose the challenges of multi-angle face recognition and tracking, as well as recognizing the smaller faces of students seated in the back rows. With the DREAM block, the LResNet34E-IR and MobileFaceNet models achieve an average accuracy improvement of 0.06% (from 80.89% to 80.92% for corpus 1, from 43.52% to 43.62% for corpus 2, and from 53.58% to 53.61% for corpus 3). By using face tracking to enhance face recognition, the average accuracy improvement is 2.12% (from 97.57% to 98.75% for corpus 1, from 63.51% to 67.36% for corpus 2, and from 67.14% to 68.47% for corpus 3). | en |
| dc.description.provenance | Made available in DSpace on 2021-06-07T17:33:41Z (GMT). No. of bitstreams: 1 U0001-1806202018281500.pdf: 9776261 bytes, checksum: fe072b64aba42b6da4c4e25750641236 (MD5) Previous issue date: 2020 | en |
| dc.description.tableofcontents | Oral Defense Committee Certification i; Acknowledgements ii; 摘要 (Chinese Abstract) iii; Abstract iv; Contents v; List of Figures viii; List of Tables xi; Chapter 1 Introduction 1; Chapter 2 Literature Review 3; 2.1 Face Detection 3; 2.2 Face Alignment 4; 2.3 Face Verification and Identification 5; 2.3.1 Feature Extraction 6; 2.3.2 Metric Learning 7; 2.4 Object Tracking 11; Chapter 3 Datasets 13; 3.1 Experimental Datasets 13; 3.2 Training Datasets 17; 3.2.1 WIDER FACE 17; 3.2.2 CFPW 17; 3.2.3 CelebA 18; 3.2.4 VGG2 18; 3.2.5 MS-Celeb-1M 19; Chapter 4 Methods, Experimental Design, and Settings 20; 4.1 Face Detection 20; 4.1.1 MTCNN 20; 4.1.2 RetinaFace 21; 4.2 Face Alignment 23; 4.3 Face Feature Extractors 24; 4.4 Face Tracking 28; 4.5 Deep Residual Equivariant Mapping 30; Chapter 5 Experimental Results and Error Analysis 33; 5.1 Face Detection Performance 33; 5.2 Face Recognition Performance 37; 5.3 Head Pose Estimation 40; 5.4 Improving Recognition with Face Tracking 40; 5.5 Improving Detection with Face Tracking 42; Chapter 6 Conclusion and Future Work 44; 6.1 Conclusion 44; 6.2 Future Work 44; Bibliography 46 | |
| dc.language.iso | zh-TW | |
| dc.subject | 人臉偵測 | zh_TW |
| dc.subject | 深殘差等變映射 | zh_TW |
| dc.subject | 人臉追蹤 | zh_TW |
| dc.subject | 人臉特徵提取 | zh_TW |
| dc.subject | 人臉辨識 | zh_TW |
| dc.subject | 智慧教室 | zh_TW |
| dc.subject | Face Detection | en |
| dc.subject | Deep Residual Equivariant Mapping | en |
| dc.subject | Face Tracking | en |
| dc.subject | Face Feature Extraction | en |
| dc.subject | Smart Classroom | en |
| dc.subject | Face Recognition | en |
| dc.title | 使用於智慧化教室的人臉偵測、追蹤及辨識 | zh_TW |
| dc.title | Face Detection, Tracking, Recognition for Smart Classrooms | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 108-2 | |
| dc.description.degree | 碩士 | |
| dc.contributor.oralexamcommittee | 傅楸善 (Chiou-Shann Fuh), 王鈺強 (Yu-Chiang Wang) | |
| dc.subject.keyword | 智慧教室,人臉偵測,人臉辨識,人臉特徵提取,人臉追蹤,深殘差等變映射 | zh_TW |
| dc.subject.keyword | Smart Classroom, Face Detection, Face Recognition, Face Feature Extraction, Face Tracking, Deep Residual Equivariant Mapping | en |
| dc.relation.page | 54 | |
| dc.identifier.doi | 10.6342/NTU202001056 | |
| dc.rights.note | 未授權 | |
| dc.date.accepted | 2020-08-04 | |
| dc.contributor.author-college | 電機資訊學院 | zh_TW |
| dc.contributor.author-dept | 資訊工程學研究所 | zh_TW |
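The abstract describes the final recognition step as comparing the distance/similarity between two face embedding vectors produced by the feature extractors (e.g., MobileFaceNet or LResNet-E-IR). A minimal sketch of that comparison, assuming L2-normalized embeddings; the `threshold` value here is illustrative and not taken from the thesis, which tunes such thresholds on its own corpora:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two face embedding vectors."""
    a = a / np.linalg.norm(a)  # L2-normalize so the dot product is the cosine
    b = b / np.linalg.norm(b)
    return float(np.dot(a, b))

def same_person(emb1: np.ndarray, emb2: np.ndarray, threshold: float = 0.5) -> bool:
    # Declare a match when similarity exceeds a tuned threshold.
    # The 0.5 default is a placeholder; real systems tune it on
    # labeled verification pairs from a validation set.
    return cosine_similarity(emb1, emb2) >= threshold
```

Face-tracking-assisted recognition, as described in the abstract, can then aggregate these per-frame decisions over a track (e.g., by majority vote) to smooth out single-frame errors.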
Appears in Collections: 資訊工程學系
Files in this item:
| File | Size | Format |
|---|---|---|
| U0001-1806202018281500.pdf (Restricted Access) | 9.55 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.