DSpace

The institutional repository system DSpace is dedicated to preserving digital materials of all kinds (e.g. text, images, PDF) and making them easy to access.

Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/15393
Full metadata record

DC Field  Value  Language
dc.contributor.advisor  張智星 (Jyh-Shing Jang)
dc.contributor.author  Jun-Qing Liu  [en]
dc.contributor.author  劉濬慶  [zh_TW]
dc.date.accessioned  2021-06-07T17:33:41Z
dc.date.copyright  2020-08-21
dc.date.issued  2020
dc.date.submitted  2020-08-03
dc.identifier.uri  http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/15393
dc.description.abstract  With the advance of convolutional neural networks, face-related technologies have made large-scale breakthroughs, and face detection and face recognition have reached commercial-grade performance. This thesis builds on the InsightFace face detection and recognition framework and explores several improvements, including using face tracking to raise recognition accuracy, and using the deep residual equivariant mapping (DREAM) block to convert feature vectors between frontal and profile faces. For face detection, we selected two algorithms, MTCNN and RetinaFace, and evaluated their detection performance on our datasets; the results show that RetinaFace performs better. After face alignment and several normalization steps, we extract feature vectors with four models (LResNet34E-IR, LResNet50E-IR, LResNet100E-IR, and MobileFaceNet) and compare their recognition performance. Our datasets were recorded mainly in classrooms; there are three datasets in total, posing multi-angle face recognition and tracking problems as well as the problem of recognizing the smaller faces of students seated at the back. With the DREAM block, face recognition with the MobileFaceNet and LResNet34E-IR models improves by 0.06% on average (0.03% on dataset 1, 0.1% on dataset 2, and 0.03% on dataset 3). Using face tracking to assist face recognition raises the average recognition accuracy by 2.12% (1.18% on dataset 1, 3.85% on dataset 2, and 1.33% on dataset 3).  [zh_TW]
dc.description.abstract  Due to the success of CNNs, face-related technologies have achieved large-scale breakthroughs recently and have reached a level mature enough for commercial applications. Based on the InsightFace face detection and recognition framework, this study proposes several improvements to face recognition, including the use of detection-based face tracking for robust recognition, and the use of the DREAM block for feature conversion between frontal and profile faces. More specifically, we evaluated two face detection algorithms, MTCNN and RetinaFace, on our classroom video datasets. After face detection, we perform face alignment and several normalization steps, and then send each face to one of four feature extractors: LResNet34E-IR, LResNet50E-IR, LResNet100E-IR, and MobileFaceNet. The final recognition step is based on the distance/similarity between two face vectors. There are three datasets in total, which pose multi-angle face recognition problems as well as the problem of recognizing the smaller faces of students far from the camera. With the DREAM block, the LResNet34E-IR and MobileFaceNet models achieve an average accuracy improvement of 0.06% (from 80.89% to 80.92% on dataset 1, from 43.52% to 43.62% on dataset 2, and from 53.58% to 53.61% on dataset 3). Using face tracking to enhance face recognition, the average accuracy improvement is 2.12% (from 97.57% to 98.75% on dataset 1, from 63.51% to 67.36% on dataset 2, and from 67.14% to 68.47% on dataset 3).  [en]
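The final matching step described in the abstract (comparing two face embedding vectors by similarity) and the tracking-assisted recognition it reports can be sketched as follows. This is an illustrative sketch, not the thesis code: the gallery layout, the 0.5 decision threshold, and the majority-vote fusion rule are hypothetical stand-ins for whatever the thesis actually uses.

```python
import numpy as np
from collections import Counter

def cosine_similarity(a, b):
    """Cosine similarity between two face embedding vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def identify(embedding, gallery, threshold=0.5):
    """Match one embedding against a gallery of reference embeddings.

    Returns the best-matching identity, or None if no similarity
    exceeds the (hypothetical) threshold.
    """
    best_name, best_sim = None, threshold
    for name, ref in gallery.items():
        sim = cosine_similarity(embedding, ref)
        if sim > best_sim:
            best_name, best_sim = name, sim
    return best_name

def vote_track(per_frame_ids):
    """Fuse per-frame identities along one face track by majority vote."""
    votes = Counter(i for i in per_frame_ids if i is not None)
    return votes.most_common(1)[0][0] if votes else None
```

With such a scheme, a track whose frames were labeled `["A", None, "A", "B"]` is fused to `"A"`: confident frames (e.g. near-frontal views) can correct frames where the per-frame match fails, which is consistent with the accuracy gains from tracking that the abstract reports.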
dc.description.provenance  Made available in DSpace on 2021-06-07T17:33:41Z (GMT). No. of bitstreams: 1
U0001-1806202018281500.pdf: 9776261 bytes, checksum: fe072b64aba42b6da4c4e25750641236 (MD5)
Previous issue date: 2020  [en]
dc.description.tableofcontents  Thesis Committee Certification i
Acknowledgements ii
Abstract (Chinese) iii
Abstract iv
Contents v
List of Figures viii
List of Tables xi
Chapter 1  Introduction 1
Chapter 2  Literature Review 3
2.1 Face Detection 3
2.2 Face Alignment 4
2.3 Face Verification and Identification 5
2.3.1 Feature Extraction 6
2.3.2 Metric Learning 7
2.4 Object Tracking 11
Chapter 3  Datasets 13
3.1 Experimental Datasets 13
3.2 Training Datasets 17
3.2.1 WIDER FACE 17
3.2.2 CFPW 17
3.2.3 CelebA 18
3.2.4 VGG2 18
3.2.5 MS-Celeb-1M 19
Chapter 4  Methods, Experimental Design, and Settings 20
4.1 Face Detection 20
4.1.1 MTCNN 20
4.1.2 RetinaFace 21
4.2 Face Alignment 23
4.3 Face Feature Extractors 24
4.4 Face Tracking 28
4.5 Deep Residual Equivariant Mapping 30
Chapter 5  Experimental Results and Error Analysis 33
5.1 Face Detection Performance 33
5.2 Face Recognition Performance 37
5.3 Head Pose Estimation 40
5.4 Improving Recognition with Face Tracking 40
5.5 Improving Detection with Face Tracking 42
Chapter 6  Conclusions and Future Work 44
6.1 Conclusions 44
6.2 Future Work 44
Bibliography 46
dc.language.iso  zh-TW
dc.subject  Face Detection  [zh_TW]
dc.subject  Deep Residual Equivariant Mapping  [zh_TW]
dc.subject  Face Tracking  [zh_TW]
dc.subject  Face Feature Extraction  [zh_TW]
dc.subject  Face Recognition  [zh_TW]
dc.subject  Smart Classroom  [zh_TW]
dc.subject  Face Detection  [en]
dc.subject  Deep Residual Equivariant Mapping  [en]
dc.subject  Face Tracking  [en]
dc.subject  Face Feature Extraction  [en]
dc.subject  Smart Classroom  [en]
dc.subject  Face Recognition  [en]
dc.title  Face Detection, Tracking, and Recognition for Smart Classrooms  [zh_TW]
dc.title  Face Detection, Tracking, Recognition for Smart Classrooms  [en]
dc.type  Thesis
dc.date.schoolyear  108-2
dc.description.degree  Master
dc.contributor.oralexamcommittee  傅楸善 (Chiou-Shann Fuh), 王鈺強 (Yu-Chiang Wang)
dc.subject.keyword  Smart Classroom, Face Detection, Face Recognition, Face Feature Extraction, Face Tracking, Deep Residual Equivariant Mapping  [zh_TW]
dc.subject.keyword  Smart Classroom, Face Detection, Face Recognition, Face Feature Extraction, Face Tracking, Deep Residual Equivariant Mapping  [en]
dc.relation.page  54
dc.identifier.doi  10.6342/NTU202001056
dc.rights.note  Not authorized for public access
dc.date.accepted  2020-08-04
dc.contributor.author-college  College of Electrical Engineering and Computer Science  [zh_TW]
dc.contributor.author-dept  Graduate Institute of Computer Science and Information Engineering  [zh_TW]
Appears in Collections: Department of Computer Science and Information Engineering

Files in This Item:
File  Size  Format
U0001-1806202018281500.pdf  (not authorized for public access)  9.55 MB  Adobe PDF


Unless otherwise specified in their copyright terms, all items in this system are protected by copyright, with all rights reserved.

Contact
No. 1, Sec. 4, Roosevelt Rd., Taipei 10617, Taiwan (R.O.C.)
Tel: (02) 3366-2353
Email: ntuetds@ntu.edu.tw
© NTU Library All Rights Reserved