Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96882

Full metadata record
| DC 欄位 | 值 | 語言 |
|---|---|---|
| dc.contributor.advisor | 丁肇隆 | zh_TW |
| dc.contributor.advisor | Chao-Lung Ting | en |
| dc.contributor.author | 黃涵榆 | zh_TW |
| dc.contributor.author | Han-Yu Huang | en |
| dc.date.accessioned | 2025-02-24T16:23:35Z | - |
| dc.date.available | 2025-02-25 | - |
| dc.date.copyright | 2025-02-24 | - |
| dc.date.issued | 2024 | - |
| dc.date.submitted | 2025-01-14 | - |
| dc.identifier.citation | [1] L. Sirovich and M. Kirby, "Low-dimensional procedure for the characterization of human faces," Journal of the Optical Society of America A, vol. 4, no. 3, pp. 519–524, 1987.
[2] M. Turk and A. Pentland, "Eigenfaces for recognition," Journal of Cognitive Neuroscience, vol. 3, no. 1, pp. 71–86, 1991.
[3] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, "Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 711–720, 1997.
[4] E. Osuna, R. Freund, and F. Girosi, "Training support vector machines: An application to face detection," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 1997.
[5] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.
[6] S. Gupta, S. Aggarwal, and A. Garg, "Face recognition using random forest," International Journal of Advanced Engineering Sciences and Technologies, vol. 5, no. 1, pp. 121–126, 2010.
[7] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," Advances in Neural Information Processing Systems (NeurIPS), vol. 25, pp. 1097–1105, 2012.
[8] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint, arXiv:1409.1556, 2014.
[9] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778, 2016.
[10] P. Ekman, Micro expressions. Oxford University Press, 2003.
[11] Z. Zhang, P. Luo, C. C. Loy, and X. Tang, "From facial expression recognition to interpersonal relation prediction," International Journal of Computer Vision, vol. 126, no. 5, pp. 550–569, 2018.
[12] S. Li and W. Deng, "Deep facial expression recognition: A survey," IEEE Transactions on Affective Computing, 2020.
[13] M. Liu, S. Shan, R. Wang, and X. Chen, "Learning expressionlets on spatio-temporal manifold for dynamic facial expression recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1749–1756, 2014.
[14] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), pp. 886–893, 2005.
[15] G. Sandbach, S. Zafeiriou, M. Pantic, and L. Yin, "Static and dynamic 3D facial expression recognition: A comprehensive survey," Image and Vision Computing, vol. 30, no. 10, pp. 683–697, 2012.
[16] T. Ojala, M. Pietikäinen, and D. Harwood, "Performance evaluation of texture measures with classification based on Kullback discrimination of distributions," in Proceedings of the 12th IAPR International Conference on Pattern Recognition, pp. 582–585, 1994.
[17] C. Shan, S. Gong, and P. W. McOwan, "Facial expression recognition based on local binary patterns: A comprehensive study," Image and Vision Computing, vol. 27, no. 6, pp. 803–816, 2009.
[18] A. Mollahosseini, B. Hasani, and M. H. Mahoor, "AffectNet: A database for facial expression, valence, and arousal computing in the wild," IEEE Transactions on Affective Computing, vol. 10, no. 1, pp. 18–31, 2019.
[19] A. Jain, Z. Huang, and Y. Chen, "Hybrid deep neural networks for facial expression recognition," in Proceedings of the IEEE International Conference on Image Processing (ICIP), pp. 486–490, 2019.
[20] K. Wang, X. Peng, J. Yang, and Y. Tian, "Attention-enhanced cross-modal CNN-LSTM for facial expression recognition," IEEE Transactions on Image Processing, vol. 29, pp. 4109–4121, 2020.
[21] H. Song, M. Kim, and J. Lee, "Learning from noisy labels with deep neural networks: A survey," IEEE Transactions on Neural Networks and Learning Systems, vol. 31, no. 4, pp. 1424–1442, 2020.
[22] K. Zhang, W. Zuo, S. Gu, and L. Zhang, "Learning deep CNN denoiser prior for image restoration," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3929–3938, 2017.
[23] S. Chaudhuri, R. Banerjee, and S. Pal, "Data quality challenges in deep learning-based FER systems," Pattern Recognition Letters, vol. 143, pp. 56–67, 2021.
[24] H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia, "Pyramid scene parsing network," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2881–2890, 2017.
[25] H. Yu, L. Wang, and J. Zhang, "Multi-scale feature fusion for facial expression recognition," Pattern Recognition, vol. 100, Art. no. 107112, 2020.
[26] Z. Huang, T. Chen, and S. Li, "Self-attention based feature fusion for micro-expression recognition," in Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 4691–4700, 2021.
[27] A. K. Roy, H. K. Kathania, A. Sharma, A. Dey, and M. S. A. Ansari, "ResEmoteNet: Bridging accuracy and loss reduction in facial emotion recognition," arXiv preprint, arXiv:2409.10545, 2024.
[28] L. Pham, T. H. Vu, and T. A. Tran, "Facial expression recognition using residual masking network," in Proceedings of the 25th International Conference on Pattern Recognition (ICPR), pp. 4513–4519, 2021.
[29] P. Kumar, S. Agarwal, and R. Gupta, "Pain detection using facial expression recognition: A review," Journal of Affective Disorders, vol. 292, pp. 66–76, 2021.
[30] S. Chaudhuri, R. Banerjee, and S. Pal, "Emotion-aware customer service using FER: Challenges and opportunities," Pattern Recognition Letters, vol. 143, pp. 56–67, 2021.
[31] H. Zhou, X. Li, and Y. Wang, "Adaptive learning through emotion recognition: A smart classroom system," Computers & Education, vol. 153, Art. no. 103900, 2020. | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96882 | - |
| dc.description.abstract | 本研究聚焦於面部表情識別(Facial Expression Recognition, FER)技術的發展與應用,旨在解決現有技術中面臨的過擬合、資料品質不足及泛化能力有限等挑戰。FER技術在人工智慧與人機互動領域具有重要價值,但由於資料標註不一致、表情類別不均衡及複雜場景中的表現不足,其準確性與穩健性仍有提升空間。本研究以RAF-DB資料集為基礎,針對資料標註錯誤及品質不佳問題進行多次篩選與處理,使用了多尺寸輸入、特徵融合架構及分段訓練策略等方法,並以AlexNet為核心模型進行優化,結合YOLO檢測技術提升臉部區域的定位精度,從而改善模型的表情辨識能力。在實驗中,經篩選後的資料集顯著提升了模型對細微表情變化的識別能力,驗證了資料品質提升對於模型性能的關鍵作用。多尺寸輸入策略表明影像解析度對學習效果的影響不可忽視;分段訓練策略有效優化了卷積層與全連接層的學習過程;特徵融合架構則透過整合局部與全局特徵,進一步提升了辨識準確率與強健性。本研究為解決FER技術中的資料品質問題提供了切實可行的方案,並透過創新模型設計與學習策略,推動了表情辨識技術的進一步發展。研究成果對教育、醫療、智慧客服等實際應用場景具有重要意義,也為未來FER系統的設計與應用提供了新的參考方向,為該領域的技術創新奠定了堅實基礎。 | zh_TW |
| dc.description.abstract | This study focuses on the development and application of Facial Expression Recognition (FER) technology, aiming to address challenges such as overfitting, insufficient data quality, and limited generalization capabilities in current techniques. FER technology holds significant value in artificial intelligence and human-computer interaction; however, its accuracy and robustness remain constrained due to issues such as inconsistent data annotation, imbalanced expression categories, and suboptimal performance in complex scenarios. Based on the RAF-DB dataset, this research undertakes extensive data filtering and preprocessing to correct annotation errors and enhance data quality. It employs methods such as multi-scale input, feature fusion architectures, and stage-wise training strategies, with AlexNet serving as the core model. Additionally, YOLO detection is integrated to improve the precision of facial region localization, thereby enhancing expression recognition performance. The experimental results demonstrate that the refined dataset significantly improves the model's ability to identify subtle expression variations, affirming the critical role of data quality in enhancing model performance. The multi-scale input strategy highlights the impact of image resolution on learning outcomes; the stage-wise training strategy effectively optimizes the learning process for convolutional and fully connected layers; and the feature fusion architecture combines local and global features, further improving recognition accuracy and robustness. This study provides a practical solution to data quality issues in FER technology and advances expression recognition through innovative model designs and learning strategies. The research findings are highly relevant to applications in education, healthcare, and intelligent customer service, offering new directions for the design and application of FER systems and establishing a solid foundation for technological innovation in this field. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-02-24T16:23:35Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2025-02-24T16:23:35Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Oral Examination Committee Certification #
Acknowledgements i
Abstract (Chinese) ii
ABSTRACT iii
Table of Contents v
List of Figures viii
List of Tables x
Chapter 1 Introduction 1
1.1 Background and Motivation 1
1.2 Research Objectives 2
1.3 Thesis Organization 2
Chapter 2 Literature Review 3
2.1 Evolution of Face Recognition Technology 3
2.1.1 Early Research on Face Recognition 3
2.1.2 Machine Learning Applications in Face Recognition 4
2.1.3 Deep Learning Applied to Face Recognition 5
2.2 Development of Facial Expression Recognition 7
2.2.1 From Face Recognition to Expression Recognition 7
2.2.2 Early Expression Recognition Methods 8
2.2.3 Deep Learning Research in Expression Recognition 9
2.3 Impact of Data Quality 10
2.4 Novel Model Architectures and Advanced Learning Strategies 11
2.5 Performance of Current FER Models 11
2.6 Applications of FER 12
Chapter 3 Research Methods 14
3.1 Dataset and Preprocessing 14
3.1.1 Introduction to the RAF-DB Dataset 14
3.1.2 Filtering and Processing of the RAF-DB Dataset 15
3.1.3 Introduction to the FER2013 Dataset 16
3.1.4 Dataset Comparison 17
3.2 Model Architecture 18
3.2.1 Model Architecture Design 18
3.2.2 Training Architectures for Different Image Sizes 18
3.2.3 Stage-wise Training Strategy 19
3.2.4 Feature Fusion Architecture 20
3.3 Model Training Process 21
3.3.1 Initial Parameter Settings 21
3.3.2 Extracting Face Regions with YOLO 22
Chapter 4 Experimental Results and Analysis 24
4.1 Model Architecture Selection 24
4.2 RAF-DB Dataset Filtering and Training 25
4.2.1 Model Performance on the Filtered RAF-DB 25
4.2.2 Comparison Before and After Filtering 29
4.3 Experimental Results of Different Model Architectures 31
4.3.1 Training Results with Different Image Sizes 31
4.3.2 Experimental Results of the Stage-wise Training Strategy 33
4.3.3 Results of the Feature Fusion Architecture 35
4.3.4 Comparison of Model Architecture Results 37
4.4 Training Results on the FER2013 Dataset 38
4.4.1 Analysis of FER2013 Test Results 38
4.4.2 Comparison of FER2013 and RAF-DB Test Results 46
4.5 YOLO Face Detection Model Test Results 47
Chapter 5 Conclusions and Future Work 50
5.1 Results and Discussion 50
5.2 Future Work 51
REFERENCE 53 | - |
| dc.language.iso | zh_TW | - |
| dc.subject | 特徵融合 | zh_TW |
| dc.subject | 分段訓練 | zh_TW |
| dc.subject | 多尺寸輸入 | zh_TW |
| dc.subject | 深度學習 | zh_TW |
| dc.subject | 面部表情識別 | zh_TW |
| dc.subject | Stage-wise Training | en |
| dc.subject | Facial Expression Recognition (FER) | en |
| dc.subject | Deep Learning | en |
| dc.subject | Multi-Scale Input | en |
| dc.subject | Feature Fusion | en |
| dc.title | 應用機器學習的人臉情緒辨識 | zh_TW |
| dc.title | Facial Emotion Recognition Based on Machine Learning | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 113-1 | - |
| dc.description.degree | 碩士 | - |
| dc.contributor.oralexamcommittee | 許巍嚴;陳彥廷;張恆華;黃乾綱 | zh_TW |
| dc.contributor.oralexamcommittee | Wei-Yen Hsu;Yen-Ting Chen;Heng-Hua Chang;Chien-Kang Huang | en |
| dc.subject.keyword | 面部表情識別,深度學習,多尺寸輸入,特徵融合,分段訓練 | zh_TW |
| dc.subject.keyword | Facial Expression Recognition (FER),Deep Learning,Multi-Scale Input,Feature Fusion,Stage-wise Training | en |
| dc.relation.page | 56 | - |
| dc.identifier.doi | 10.6342/NTU202500112 | - |
| dc.rights.note | 同意授權(限校園內公開) | - |
| dc.date.accepted | 2025-01-14 | - |
| dc.contributor.author-college | 工學院 | - |
| dc.contributor.author-dept | 工程科學及海洋工程學系 | - |
| dc.date.embargo-lift | 2030-01-14 | - |
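The abstract describes a feature fusion architecture that combines local and global features extracted from the face image at different scales. As a minimal, purely illustrative sketch of that fusion idea (the pooling functions, grid sizes, and dimensions below are hypothetical stand-ins for the thesis's AlexNet-based convolutional branches, not its actual implementation):

```python
import numpy as np

def pool_features(img, grid):
    """Average-pool an image into a grid x grid descriptor.

    This is a toy stand-in for a CNN feature extractor: a coarse grid
    summarizes the whole face (global), a finer grid keeps more spatial
    detail (local).
    """
    h, w = img.shape
    gh, gw = h // grid, w // grid
    return np.array([
        img[r * gh:(r + 1) * gh, c * gw:(c + 1) * gw].mean()
        for r in range(grid) for c in range(grid)
    ])

def fuse_features(img):
    """Feature fusion: concatenate global and local descriptors of one face."""
    global_feat = pool_features(img, 2)   # 4-dim global summary
    local_feat = pool_features(img, 4)    # 16-dim local detail
    return np.concatenate([global_feat, local_feat])  # 20-dim fused vector

rng = np.random.default_rng(0)
face = rng.random((64, 64))   # stand-in for a YOLO-cropped face region
fused = fuse_features(face)
print(fused.shape)            # (20,)
```

In the thesis itself, each branch would be a convolutional network fed with a different input resolution, and the fused vector would feed the fully connected classification layers; the concatenation step shown here is the common core of such designs.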
| Appears in Collections: | 工程科學及海洋工程學系 (Department of Engineering Science and Ocean Engineering) |
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-113-1.pdf (restricted access) | 2.99 MB | Adobe PDF | View/Open |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
