Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/79221
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 丁肇隆 | zh_TW |
dc.contributor.advisor | Chao-Lung Ting | en |
dc.contributor.author | 林佩宜 | zh_TW |
dc.contributor.author | Pei-Yi Lin | en |
dc.date.accessioned | 2022-11-14T03:46:35Z | - |
dc.date.available | 2023-11-09 | - |
dc.date.copyright | 2022-10-28 | - |
dc.date.issued | 2022 | - |
dc.date.submitted | 2022-10-27 | - |
dc.identifier.citation | [1] C.-Y. Wu, M. Zaheer, H. Hu, R. Manmatha, A. J. Smola, and P. Krähenbühl, "Compressed video action recognition," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 6026-6035.
[2] A. F. Bobick and J. W. Davis, "The recognition of human movement using temporal templates," IEEE Transactions on pattern analysis and machine intelligence, vol. 23, no. 3, pp. 257-267, 2001.
[3] K. Simonyan and A. Zisserman, "Two-stream convolutional networks for action recognition in videos," Advances in neural information processing systems, vol. 27, 2014.
[4] J.-H. Kim and C. S. Won, "Action recognition in videos using pre-trained 2D convolutional neural networks," IEEE Access, vol. 8, pp. 60179-60188, 2020.
[5] M. V. V. Model, "Coding of Moving Pictures and Associated Audio Information," ISO/IEC JTC1/SC29/WG11, 1997.
[6] I. M. Committee, "Coding of moving pictures and associated audio for storage at up to about 1.5 mbit/s, part 3: Audio," ISO/IEC 11172, vol. 3, 1993.
[7] I. IEC, "Information Technology Generic Coding of Moving Pictures and Associated Audio Information Part 2: Video," ed, 1994.
[8] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, 1998.
[9] K. Chellapilla, S. Puri, and P. Simard, "High performance convolutional neural networks for document processing," in Tenth international workshop on frontiers in handwriting recognition, 2006: Suvisoft.
[10] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "Imagenet classification with deep convolutional neural networks," Communications of the ACM, vol. 60, no. 6, pp. 84-90, 2017.
[11] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
[12] C. Szegedy et al., "Going deeper with convolutions," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2015, pp. 1-9.
[13] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770-778.
[14] S. Xie, R. Girshick, P. Dollár, Z. Tu, and K. He, "Aggregated residual transformations for deep neural networks," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 1492-1500.
[15] A. G. Howard et al., "Mobilenets: Efficient convolutional neural networks for mobile vision applications," arXiv preprint arXiv:1704.04861, 2017.
[16] X. Zhang, X. Zhou, M. Lin, and J. Sun, "Shufflenet: An extremely efficient convolutional neural network for mobile devices," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 6848-6856.
[17] J. Sánchez, F. Perronnin, T. Mensink, and J. Verbeek, "Image classification with the fisher vector: Theory and practice," International journal of computer vision, vol. 105, no. 3, pp. 222-245, 2013.
[18] S. Ji, W. Xu, M. Yang, and K. Yu, "3D convolutional neural networks for human action recognition," IEEE transactions on pattern analysis and machine intelligence, vol. 35, no. 1, pp. 221-231, 2012.
[19] D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri, "Learning spatiotemporal features with 3d convolutional networks," in Proceedings of the IEEE international conference on computer vision, 2015, pp. 4489-4497.
[20] K. Soomro, A. R. Zamir, and M. Shah, "UCF101: A dataset of 101 human actions classes from videos in the wild," arXiv preprint arXiv:1212.0402, 2012.
[21] H. Kuehne, H. Jhuang, E. Garrote, T. Poggio, and T. Serre, "HMDB: a large video database for human motion recognition," in 2011 International conference on computer vision, 2011: IEEE, pp. 2556-2563.
[22] L. Wang et al., "Temporal segment networks: Towards good practices for deep action recognition," in European conference on computer vision, 2016: Springer, pp. 20-36.
[23] H.-S. Fang, S. Xie, Y.-W. Tai, and C. Lu, "Rmpe: Regional multi-person pose estimation," in Proceedings of the IEEE international conference on computer vision, 2017, pp. 2334-2343.
[24] J. Canny, "A computational approach to edge detection," IEEE Transactions on pattern analysis and machine intelligence, no. 6, pp. 679-698, 1986.
[25] 楊雅筑, "運用影像前處理提升卷積神經網路於人物動作辨識之準確率" [Using image preprocessing to improve the accuracy of convolutional neural networks for human action recognition], Master's thesis, Graduate Institute of Engineering Science and Ocean Engineering, National Taiwan University, Taipei, 2021. [Online]. Available: https://hdl.handle.net/11296/v6uyx8
[26] S. M. Pizer et al., "Adaptive histogram equalization and its variations," Computer vision, graphics, and image processing, vol. 39, no. 3, pp. 355-368, 1987. | - |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/79221 | - |
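dc.description.abstract | In recent years, with the rapid development of deep learning and advances in computing power, object recognition and classification in images have improved significantly. As the volume of video data grows, recognizing human actions in videos has also attracted increasing attention. However, an action in a video can only be judged correctly from consecutive frames, and the repeated frames contain a great deal of redundant information. For this reason, [1] proposed working with compressed information, extracting motion vectors and residuals directly from compressed video as training input in order to reduce training cost. This study builds on the CoViAR method and improves it. Experimental results confirm that recognition accuracy improves when additional motion information replaces color variation and when motion history images are applied to the residual images. In addition, fusing the single-frame image and motion vector information into the input of one model, and reducing the three-channel residual-image model to a single channel, reduces the number of trainable parameters and achieves a considerable level of recognition performance at lower computational cost. | zh_TW |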
dc.description.abstract | In recent years, object recognition and classification in images have advanced significantly, driven by rapid progress in deep learning and computing power. As the amount of video data grows, human action recognition in videos is also receiving more attention, but an action can only be identified correctly from consecutive frames, and the repeated frames carry much redundant information. To address this, [1] proposed extracting motion vectors and residual information directly from compressed videos as training input, which reduces training cost. This study builds on the CoViAR method and improves it. The experimental results show that recognition accuracy improves when more motion information is used in place of color variation and when motion history images are applied to the residual images. In addition, combining I-frame image and motion vector information as the input of a single model, and reducing the three-channel residual-image model to a single channel, reduces the number of model parameters and achieves considerable recognition performance at lower computational cost. | en |
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2022-11-14T03:46:35Z
No. of bitstreams: 0 | en |
dc.description.provenance | Made available in DSpace on 2022-11-14T03:46:35Z (GMT). No. of bitstreams: 0 | en |
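dc.description.tableofcontents | Acknowledgements i
Abstract (Chinese) ii
ABSTRACT iii
Table of Contents iv
List of Tables vi
List of Figures vii
Chapter 1 Introduction 1
1.1 Research Background and Motivation 1
1.2 Thesis Organization 2
Chapter 2 Literature Review 3
2.1 Video Frame Sampling Methods 3
2.1.1 Motion History Image (MHI) 3
2.1.2 Stacked Grayscale 3-Channel Image 4
2.2 Video Coding and Motion Vector Estimation 6
2.3 Deep Learning Neural Networks 8
2.4 Human Action Recognition 12
2.5 UCF Action Dataset 16
Chapter 3 Methodology 18
3.1 I-frame Image Processing Method 18
3.2 Motion Vector Estimation 21
3.2.1 Edge Detection 21
3.2.2 Fusion with I-frame Images 24
3.3 Residual Image Processing Method 26
3.3.1 Frame Sampling Method 26
3.3.2 Application of Motion History Images 27
3.4 Experimental Network Architecture and Training Procedure 30
3.4.1 Convolutional Neural Network Training 31
3.4.2 Score Fusion Method 34
3.4.3 Loss Function 34
Chapter 4 Experimental Results and Discussion 35
4.1 Experimental Dataset 35
4.2 I-frame Image Experiments 37
4.3 Motion Vector Experiments 38
4.3.1 Edge Detection Input 39
4.3.2 Fusion with I-frame Images 40
4.4 Residual Image Experiments 41
4.4.1 Different Frame Sampling Methods 42
4.4.2 Motion History Image 43
4.5 Score Fusion 44
Chapter 5 Conclusion 51
References 52 | - |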
dc.language.iso | zh_TW | - |
dc.title | 應用影像前處理以提升人物動作辨識網路之效能 | zh_TW |
dc.title | Application of Image Preprocessing to Improve Human Action Recognition | en |
dc.title.alternative | Application of Image Preprocessing to Improve Human Action Recognition | - |
dc.type | Thesis | - |
dc.date.schoolyear | 111-1 | - |
dc.description.degree | Master's | - |
dc.contributor.oralexamcommittee | 張瑞益;張恆華;黃乾綱 | zh_TW |
dc.contributor.oralexamcommittee | Ray-I Chang;Herng-Hua Chang;Chien-Kang Huang | en |
dc.subject.keyword | human action recognition, convolutional neural network, deep learning, image processing, compressed video format | zh_TW |
dc.subject.keyword | human action recognition, convolutional neural network, deep learning, image processing, compressed video | en |
dc.relation.page | 53 | - |
dc.identifier.doi | 10.6342/NTU202210002 | - |
dc.rights.note | Authorization granted (open access worldwide) | - |
dc.date.accepted | 2022-10-27 | - |
dc.contributor.author-college | College of Engineering | - |
dc.contributor.author-dept | Department of Engineering Science and Ocean Engineering | - |
dc.date.embargo-lift | 2023-11-09 | - |
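
The abstract above reports that applying motion history images (MHI) to residual images improves recognition accuracy. As a rough illustration only, the following Python sketch implements the standard MHI update rule from Bobick and Davis [2]: a pixel that moves is reset to `tau`, and every other pixel decays by one step toward zero. The function name, the `threshold` parameter, and the assumption that each residual frame has already been reduced to a single channel are hypothetical choices for this sketch; the record does not describe the thesis's actual implementation.

```python
import numpy as np


def motion_history_image(frames, tau, threshold):
    """Build a motion history image from a sequence of 2-D frames.

    frames:    sequence of 2-D arrays, assumed here to be single-channel
               residual frames (hypothetical input format).
    tau:       number of steps a motion trace persists before fading out.
    threshold: hypothetical cutoff; pixels whose absolute value exceeds it
               are treated as moving.
    """
    frames = [np.asarray(f, dtype=np.float32) for f in frames]
    mhi = np.zeros(frames[0].shape, dtype=np.float32)
    for frame in frames:
        moving = np.abs(frame) > threshold
        # Moving pixels are reset to tau; all others decay by 1, floored at 0.
        mhi = np.where(moving, np.float32(tau), np.maximum(mhi - 1.0, 0.0))
    # Scale to [0, 1] so recent motion is bright and older motion is dim.
    return mhi / np.float32(tau)
```

Calling `motion_history_image(residual_frames, tau=len(residual_frames), threshold=...)` yields one grayscale image in which brightness encodes how recently each pixel changed, so a single-channel input can summarize several frames of motion.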
Appears in Collections: | Department of Engineering Science and Ocean Engineering |
Files in this item:
File | Size | Format | |
---|---|---|---|
U0001-1028221026495102.pdf | 2.8 MB | Adobe PDF | View/Open |
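Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.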