Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/68838
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 顏嗣鈞(Hsu-Chun Yen) | |
dc.contributor.author | Chen-Wei Yang | en |
dc.contributor.author | 楊鎮維 | zh_TW |
dc.date.accessioned | 2021-06-17T02:37:55Z | - |
dc.date.available | 2020-08-24 | |
dc.date.copyright | 2020-08-24 | |
dc.date.issued | 2020 | |
dc.date.submitted | 2020-08-18 | |
dc.identifier.citation | '1-by-1 convolution.' [Online]. Available: https://pic4.zhimg.com/v2-e52fc0baf42226e2dfa0f8afc5658c49_r.jpg
C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, 'Going Deeper with Convolutions,' Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–9, 2015.
'global average pooling.' [Online]. Available: https://miro.medium.com/max/1538/1*-y6BjnXNHEXaeEdi-eHzvg.png
K. Simonyan and A. Zisserman, 'Two-Stream Convolutional Networks for Action Recognition in Videos,' Proceedings of Neural Information Processing Systems (NIPS), 2014.
L. Sevilla-Lara, Y. Liao, F. Güney, V. Jampani, A. Geiger, and M. J. Black, 'On the integration of optical flow and action recognition,' in Pattern Recognition, T. Brox, A. Bruhn, and M. Fritz, Eds. Cham: Springer International Publishing, 2019, pp. 281–297.
(2010) 手語介紹—手語及台灣手語介紹. [Online]. Available: http://www.cnad.org.tw/ap/news_view.aspx?bid=25 sn=35ef3d27-4e95-4116-a45d-74e76ece9998
Y.-H. Lee and C.-Y. Tsai, 'Taiwan sign language (TSL) recognition based on 3D data and neural networks,' Expert Systems with Applications, vol. 36, pp. 1123–1128, 2009.
G. C. Lee, F.-H. Yeh, and Y.-H. Hsiao, 'Kinect-based Taiwanese sign-language recognition system,' Multimedia Tools and Applications, vol. 75, pp. 261–279, 2016.
Rung-Huei Liang and Ming Ouhyoung, 'A real-time continuous gesture recognition system for sign language,' in Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition, 1998, pp. 558–567.
C.-T. Hsieh, H.-C. Liou, and L.-M. Chen, 'Taiwan sign language recognition system using lc-ksvd sparse coding method,' in Proceedings of the 2016 6th International Conference on Machinery, Materials, Environment, Biotechnology and Computer. Atlantis Press, 2016/06, pp. 519–523. [Online]. Available: https://doi.org/10.2991/mmebc-16.2016.112
P. Maryam, V. Mansour, and S. Majid, 'Sign Language recognition,' 03 2007, pp. 1–4.
B. K. Horn and B. G. Schunck, 'Determining Optical Flow,' in Techniques and Applications of Image Understanding, J. J. Pearson, Ed., vol. 0281, International Society for Optics and Photonics. SPIE, 1981, pp. 319–331. [Online]. Available: https://doi.org/10.1117/12.965761
C. Zach, T. Pock, and H. Bischof, 'A duality based approach for realtime tv-l1 optical flow,' in Pattern Recognition, F. A. Hamprecht, C. Schnörr, and B. Jähne, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2007, pp. 214–223.
T. Amiaz, E. Lubetzky, and N. Kiryati, 'Coarse to Over-Fine Optical Flow Estimation,' Pattern Recognition, vol. 40, pp. 2496–2503, 2007.
A. Krizhevsky, I. Sutskever, and G. E. Hinton, 'ImageNet Classification with Deep Convolutional Neural Networks,' Proceedings of Neural Information Processing Systems (NIPS), 2012.
M. Lin, Q. Chen, and S. Yan, 'Network In Network,' arXiv e-prints, p. arXiv:1312.4400, Dec. 2013.
J. Carreira and A. Zisserman, 'Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset,' Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6299–6308, 2017.
C. Feichtenhofer, A. Pinz, and A. Zisserman, 'Convolutional Two-Stream Network Fusion for Video Action Recognition,' Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1933–1941, 2016.
L. Wang, Y. Xiong, Z. Wang, Y. Qiao, D. Lin, X. Tang, and L. Van Gool, 'Temporal segment networks: Towards good practices for deep action recognition,' in Computer Vision – ECCV 2016, B. Leibe, J. Matas, N. Sebe, and M. Welling, Eds. Cham: Springer International Publishing, 2016, pp. 20–36.
蔡素娟、戴浩一、陳怡君. (2015) 【台灣手語線上辭典】第三版中文版. 國立中正大學語言學研究所. [Online]. Available: http://tsl.ccu.edu.tw/web/browser.htm
張榮興. (2011) 台灣手語地名電子資料庫. 嘉義: 國立中正大學語言學研究所. [Online]. Available: http://signlanguage.ccu.edu.tw/placenames.php
林亞秀. (2015) 台灣手語拾遺. [Online]. Available: https://www.twsl.cc/ | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/68838 | - |
dc.description.abstract | 在台灣手語辨識的相關研究中,往往都是透過輔助裝置獲得影片中的手部、運動軌跡等資訊並利用傳統特徵擷取算法來實驗與闡述,使得這些研究往往僅能解決在特定情境、特定資料集下的辨識問題,且許多論文並未明確指示出實驗資料集的數量與分布。本論文首先建立了一個針對台灣手語單字影片的資料集,共有4位不同人在不同背景比出常用之63種手語單字,每種單字種類有8至9個影片,共有538支影片;接著透過近年在影像動作辨識上取得良好效果的雙流膨脹卷積網路以泛化的形式解決手語辨識問題,在測試集上達到87.3%的準確率,並透過實驗及錯誤分析給出針對光流和rgb frame作為輸入的往後改進方向。最後將訓練完的模型應用到能即時展示辨識結果的程式中。 | zh_TW |
dc.description.abstract | Prior research on Taiwanese sign language recognition often relies on auxiliary devices to obtain hand position and motion trajectory information from sign language videos, and uses traditional feature extraction algorithms such as SIFT and SURF in its experiments. As a result, this research tends to apply only to a specific dataset or scenario. Furthermore, many papers do not clearly report the size and distribution of their experimental datasets. In this thesis, we first build a dataset of Taiwanese sign language word videos, in which 4 different signers perform 63 commonly used sign language words against different backgrounds. Each word has 8 to 9 videos, for a total of 538 videos. Second, we apply a two-stream inflated convolutional network, which has achieved good results in video action recognition as reported in the literature, to solve the Taiwanese sign language recognition problem in a generalized form. In our experiments, we reach an accuracy of 87.3% on the test set, and through experiments and error analysis we indicate directions for future improvement with RGB frames and optical flow as model inputs. Finally, the trained model is applied to a program that displays the recognition results in real time. | en |
dc.description.provenance | Made available in DSpace on 2021-06-17T02:37:55Z (GMT). No. of bitstreams: 1 U0001-1708202004320900.pdf: 5109695 bytes, checksum: 64acc45ceeb82158bb66173f222b9e54 (MD5) Previous issue date: 2020 | en |
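dc.description.tableofcontents | Table of Contents: Oral Examination Committee Certification; Acknowledgements; Chinese Abstract; English Abstract; List of Figures; List of Tables; 1 Introduction; 1.1 Background; 1.2 Related Work on Taiwanese Sign Language; 2 Overview of Neural Network Models and Optical Flow; 2.1 Optical Flow; 2.1.1 Assumptions and the Basic Optical Flow Equation; 2.1.2 Horn-Schunck Method; 2.2 Convolutional Neural Network (CNN); 2.2.1 Convolution; 2.2.2 Max Pooling; 2.3 Inception Network; 2.3.1 1x1 Convolution; 2.3.2 Inception Module; 2.3.3 Dimension-Reduction Inception Module; 2.3.4 Training Details; 3 Video Action Recognition Architectures; 3.1 Convolutional Neural Network (CNN) + Long Short-Term Memory (LSTM); 3.2 3D Convolutional Neural Network (3DCNN); 3.3 Two-Stream Convolutional Networks; 3.3.1 Optical Flow Stacking; 3.3.2 Trajectory Stacking; 3.3.3 Bidirectional Optical Flow Stacking; 3.4 Two-Stream Inflated 3DCNN; 3.4.1 Extending a 2D Pretrained Model to an Inflated 3DCNN; 3.4.2 Initializing the Inflated 3DCNN from the Parameters of the 2D Pretrained Model; 3.4.3 Adjusting the Receptive Field of Feature Learning; 3.4.4 Adding Optical Flow as Input; 3.5 Discussion of the Usefulness of Optical Flow; 4 Taiwanese Sign Language Word Recognition; 4.1 Sign Language Dataset; 4.1.1 Data Sources; 4.1.2 Data Analysis; 4.2 Word Recognition Pipeline; 4.2.1 Data Preprocessing; 4.2.2 Model Architecture; 5 Experimental Results and Analysis; 5.1 Comparison of Model Inputs, Sampling, and Dataset Sizes; 5.1.1 Different Numbers of Samples per Video; 5.1.2 RGB vs. Optical Flow Input; 5.1.3 Comparison of Different Dataset Sizes; 5.2 Error Discussion and Improvements; 6 Conclusion and Future Work; 6.1 Conclusion; 6.2 Future Work; References | |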
dc.language.iso | zh-TW | |
dc.title | 雙流膨脹卷積網路之台灣手語單字辨識 | zh_TW |
dc.title | Two-Stream Inflated 3DCNN for Taiwanese Sign Language Recognition | en |
dc.type | Thesis | |
dc.date.schoolyear | 108-2 | |
dc.description.degree | Master's | |
dc.contributor.oralexamcommittee | 雷欽隆(Chin-Laung Lei),郭斯彥(Sy-Yen Kuo) | |
dc.subject.keyword | 膨脹卷積神經網路,光流,影片分類,深度學習,台灣手語單字資料集, | zh_TW |
dc.subject.keyword | Inflated Convolution Neural Network, Optical Flow, Video Classification, Deep Learning, Taiwanese Sign Language Word Dataset | en |
dc.relation.page | 34 | |
dc.identifier.doi | 10.6342/NTU202003667 | |
dc.rights.note | Paid authorization (有償授權) | |
dc.date.accepted | 2020-08-19 | |
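dc.contributor.author-college | College of Electrical Engineering and Computer Science (電機資訊學院) | zh_TW |
dc.contributor.author-dept | Graduate Institute of Electrical Engineering (電機工程學研究所) | zh_TW |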
Appears in Collections: | Department of Electrical Engineering (電機工程學系) |
Files in This Item:
File | Size | Format | |
---|---|---|---|
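U0001-1708202004320900.pdf (not currently authorized for public access) | 4.99 MB | Adobe PDF |
Items in the system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.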