Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/89860
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 馬劍清 | zh_TW |
dc.contributor.advisor | Chien-Ching Ma | en |
dc.contributor.author | 陳彥霖 | zh_TW |
dc.contributor.author | Yen-Lin Chen | en |
dc.date.accessioned | 2023-09-22T16:25:59Z | - |
dc.date.available | 2023-11-09 | - |
dc.date.copyright | 2023-09-22 | - |
dc.date.issued | 2023 | - |
dc.date.submitted | 2023-08-10 | - |
dc.identifier.citation | [1] A. Vaswani et al., "Attention is all you need," Advances in Neural Information Processing Systems, vol. 30, 2017.
[2] A. Radford, K. Narasimhan, T. Salimans, and I. Sutskever, "Improving language understanding by generative pre-training," 2018.
[3] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, and I. Sutskever, "Language models are unsupervised multitask learners," OpenAI Blog, vol. 1, no. 8, p. 9, 2019.
[4] T. Brown et al., "Language models are few-shot learners," Advances in Neural Information Processing Systems, vol. 33, pp. 1877-1901, 2020.
[5] OpenAI, "GPT-4 technical report," arXiv preprint arXiv:2303.08774, 2023.
[6] A. Dosovitskiy et al., "An image is worth 16x16 words: Transformers for image recognition at scale," arXiv preprint arXiv:2010.11929, 2020.
[7] J. Ho, A. Jain, and P. Abbeel, "Denoising diffusion probabilistic models," Advances in Neural Information Processing Systems, vol. 33, pp. 6840-6851, 2020.
[8] W. H. Peters and W. F. Ranson, "Digital imaging techniques in experimental stress analysis," Optical Engineering, vol. 21, no. 3, pp. 427-431, 1982.
[9] M. A. Sutton, W. J. Wolters, W. H. Peters, W. F. Ranson, and S. R. McNeill, "Determination of displacements using an improved digital correlation method," Image and Vision Computing, vol. 1, no. 3, pp. 133-139, 1983.
[10] W. H. Peters, W. F. Ranson, M. A. Sutton, T. C. Chu, and J. Anderson, "Application of digital correlation methods to rigid body mechanics," Optical Engineering, vol. 22, no. 6, pp. 738-742, 1983.
[11] H. Lu and P. D. Cary, "Deformation measurements by digital image correlation: implementation of a second-order displacement gradient," Experimental Mechanics, vol. 40, no. 4, pp. 393-400, 2000.
[12] M. A. Sutton, C. Mingqi, W. H. Peters, Y. J. Chao, and S. R. McNeill, "Application of an optimized digital correlation method to planar deformation analysis," Image and Vision Computing, vol. 4, no. 3, pp. 143-150, 1986.
[13] H. A. Bruck, S. R. McNeill, M. A. Sutton, and W. H. Peters III, "Digital image correlation using Newton-Raphson method of partial differential correction," Experimental Mechanics, vol. 29, no. 3, pp. 261-267, 1989.
[14] G. Vendroux and W. G. Knauss, "Submicron deformation field measurements: Part 2. Improved digital image correlation," Experimental Mechanics, vol. 38, no. 2, pp. 86-92, 1998.
[15] S. Baker and I. Matthews, "Equivalence and efficiency of image alignment algorithms," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Kauai, HI, USA, 2001.
[16] S. Baker and I. Matthews, "Lucas-Kanade 20 years on: A unifying framework," International Journal of Computer Vision, vol. 56, no. 3, pp. 221-255, 2004.
[17] B. Pan, K. Li, and W. Tong, "Fast, robust and accurate digital image correlation calculation without redundant computations," Experimental Mechanics, vol. 53, no. 7, pp. 1277-1289, 2013.
[18] B. Pan, "An evaluation of convergence criteria for digital image correlation using inverse compositional Gauss-Newton algorithm," Strain, vol. 50, no. 1, pp. 48-56, 2014.
[19] Y. Gao, T. Cheng, Y. Su, X. Xu, Y. Zhang, and Q. Zhang, "High-efficiency and high-accuracy digital image correlation for three-dimensional measurement," Optics and Lasers in Engineering, vol. 65, pp. 73-80, 2015.
[20] Z. L. Kahn-Jetter and T. C. Chu, "Three-dimensional displacement measurements using digital image correlation and photogrammic analysis," Experimental Mechanics, vol. 30, no. 1, pp. 10-16, 1990.
[21] P. F. Luo, Y. J. Chao, M. A. Sutton, and W. H. Peters III, "Accurate measurement of three-dimensional deformations in deformable and rigid bodies using computer vision," Experimental Mechanics, vol. 33, no. 2, pp. 123-132, 1993.
[22] Z. Zhang, "A flexible new technique for camera calibration," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 11, pp. 1330-1334, 2000.
[23] 張景媖,馬劍清,「數位影像相關法應用於跨尺度跨領域靜態及動態全域位移與應變精密量測」,碩士論文,機械工程學研究所,臺灣大學,2013。
[24] 周宛萱,馬劍清,「建構高精度數位影像相關法並應用於土木結構動態系統及奈米材料微系統的變形量測」,碩士論文,機械工程學研究所,臺灣大學,2014。
[25] 簡宸煜,馬劍清,「應用數位影像相關法於土木結構及碳纖維性質與電池表面變化之量測」,碩士論文,機械工程學研究所,臺灣大學,2015。
[26] 彭柏勳,馬劍清,「應用數位影像相關法於機械系統與土木結構之變形及動態特性量測」,碩士論文,機械工程學研究所,臺灣大學,2016。
[27] 陳亮至,馬劍清,「建構立體數位影像相關法之基礎理論並應用於結構靜態與動態三維變形精密量測」,碩士論文,機械工程學研究所,臺灣大學,2016。
[28] 王盛儀,馬劍清,「數位影像相關法於二維軌跡及變形量測和應用於建構立體形貌」,碩士論文,機械工程學研究所,臺灣大學,2018。
[29] 黃右年,馬劍清,「建立即時立體數位影像相關法於三維工程問題的動態量測」,碩士論文,機械工程學研究所,臺灣大學,2018。
[30] 毛英澤,馬劍清,「數位影像相關法於高速主軸即時監測與機械系統動態行為及車輛追跡之跨領域量測」,碩士論文,機械工程學研究所,臺灣大學,2019。
[31] 李宇倫,馬劍清,「提升數位影像相關法的量測精度並應用於車輛追蹤與機械手臂的三維量測」,碩士論文,機械工程學研究所,臺灣大學,2020。
[32] 陳義翔,馬劍清,「優化數位影像相關法並應用於跨尺度問題的精密量測」,碩士論文,機械工程學研究所,臺灣大學,2020。
[33] 吳俊賢,馬劍清,「建立數位結構光量測系統並應用於三維形貌與變形量測和機械手臂手眼校正與取放任務」,碩士論文,機械工程學研究所,臺灣大學,2020。
[34] 李霽儒,馬劍清,「提升數位影像相關法效能並應用於跨尺度動態問題量測與機械手臂之系統整合」,碩士論文,機械工程學研究所,臺灣大學,2021。
[35] 謝佳軒,馬劍清,「數位影像相關法於精密量測與人機共工系統的整合應用」,碩士論文,機械工程學研究所,臺灣大學,2022。
[36] B. Pan, K. Li, and W. Tong, "Fast, robust and accurate digital image correlation calculation without redundant computations," Experimental Mechanics, vol. 53, no. 7, pp. 1277-1289, 2013.
[37] T. M. Mitchell, Machine Learning. New York: McGraw-Hill, 1997.
[38] X. Glorot, A. Bordes, and Y. Bengio, "Deep sparse rectifier neural networks," in Proc. 14th International Conference on Artificial Intelligence and Statistics (AISTATS), JMLR Workshop and Conference Proceedings, 2011.
[39] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A simple way to prevent neural networks from overfitting," Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929-1958, 2014.
[40] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, 1998.
[41] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," Communications of the ACM, vol. 60, no. 6, pp. 84-90, 2017.
[42] Z. Cao, T. Simon, S.-E. Wei, and Y. Sheikh, "Realtime multi-person 2D pose estimation using part affinity fields," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
[43] C. Lugaresi et al., "MediaPipe: A framework for building perception pipelines," arXiv preprint arXiv:1906.08172, 2019.
[44] V. Bazarevsky, I. Grishchenko, K. Raveendran, T. Zhu, F. Zhang, and M. Grundmann, "BlazePose: On-device real-time body pose tracking," arXiv preprint arXiv:2006.10204, 2020.
[45] V. Bazarevsky, Y. Kartynnik, A. Vakunov, K. Raveendran, and M. Grundmann, "BlazeFace: Sub-millisecond neural face detection on mobile GPUs," arXiv preprint arXiv:1907.05047, 2019.
[46] I. Grishchenko, A. Ablavatski, Y. Kartynnik, K. Raveendran, and M. Grundmann, "High-fidelity face mesh prediction in real-time," arXiv preprint arXiv:2006.10962, 2020.
[47] F. Zhang et al., "MediaPipe Hands: On-device real-time hand tracking," arXiv preprint arXiv:2006.10214, 2020.
[48] A. Newell, K. Yang, and J. Deng, "Stacked hourglass networks for human pose estimation," in Proc. European Conference on Computer Vision (ECCV), pp. 483-499, Springer, 2016.
[49] Z. Cao, T. Simon, S.-E. Wei, and Y. Sheikh, "Realtime multi-person 2D pose estimation using part affinity fields," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
[50] S. Mroz et al., "Comparing the quality of human pose estimation with BlazePose or OpenPose," in Proc. 4th International Conference on Bio-Engineering for Smart Technologies (BioSMART), IEEE, 2021.
[51] W. Förstner, "A fast operator for detection and precise location of distinct points, corners and centres of circular features," in Proc. Intercommission Conference on Fast Processing of Photogrammetric Data, Interlaken, Switzerland, pp. 281-305, 1987.
[52] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision. Cambridge University Press, 2003.
[53] A. Grunnet-Jepsen, J. N. Sweetser, P. Winer, A. Takagi, and J. Woodfill, "Projectors for Intel® RealSense™ Depth Cameras D4xx."
[54] A. Maćkiewicz and W. Ratajczak, "Principal components analysis (PCA)," Computers & Geosciences, vol. 19, no. 3, 1993.
[55] L. van der Maaten and G. Hinton, "Visualizing data using t-SNE," Journal of Machine Learning Research, vol. 9, 2008.
[56] E. S. L. Gastal and M. M. Oliveira, "Domain transform for edge-aware image and video processing," in ACM SIGGRAPH 2011 Papers, pp. 1-12, 2011.
[57] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov, "Improving neural networks by preventing co-adaptation of feature detectors," arXiv preprint arXiv:1207.0580, 2012.
[58] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You only look once: Unified, real-time object detection," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[59] W. Liu et al., "SSD: Single shot multibox detector," in Computer Vision - ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part I, Springer, 2016.
[60] K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask R-CNN," in Proc. IEEE International Conference on Computer Vision (ICCV), 2017.
[61] R. Girshick, "Fast R-CNN," in Proc. IEEE International Conference on Computer Vision (ICCV), 2015.
[62] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards real-time object detection with region proposal networks," Advances in Neural Information Processing Systems, vol. 28, 2015.
[63] J. Redmon and A. Farhadi, "YOLO9000: Better, faster, stronger," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
[64] J. Redmon and A. Farhadi, "YOLOv3: An incremental improvement," arXiv preprint arXiv:1804.02767, 2018.
[65] A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, "YOLOv4: Optimal speed and accuracy of object detection," arXiv preprint arXiv:2004.10934, 2020.
[66] C.-Y. Wang, H.-Y. M. Liao, Y.-H. Wu, P.-Y. Chen, J.-W. Hsieh, and I.-H. Yeh, "CSPNet: A new backbone that can enhance learning capability of CNN," in Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2020.
[67] K. He, X. Zhang, S. Ren, and J. Sun, "Spatial pyramid pooling in deep convolutional networks for visual recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 9, pp. 1904-1916, 2015.
[68] S. Liu, L. Qi, H. Qin, J. Shi, and J. Jia, "Path aggregation network for instance segmentation," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
[69] T.-Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie, "Feature pyramid networks for object detection," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
[70] Z. Yao, Y. Cao, S. Zheng, G. Huang, and S. Lin, "Cross-iteration batch normalization," in Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
[71] G. Jocher et al., "ultralytics/yolov5: v7.0 - YOLOv5 SOTA realtime instance segmentation," Zenodo, 2022.
[72] N. Otsu, "A threshold selection method from gray-level histograms," IEEE Transactions on Systems, Man, and Cybernetics, vol. 9, no. 1, pp. 62-66, 1979.
[73] S. Suzuki and K. Abe, "Topological structural analysis of digitized binary images by border following," Computer Vision, Graphics, and Image Processing, vol. 30, no. 1, pp. 32-46, 1985. | - |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/89860 | - |
dc.description.abstract | 本論文整合數位影像相關法光學量測技術(Digital image correlation, DIC)與深度學習應用於工業問題。數位影像相關法為非接觸式、非破壞全場域、跨尺度的量測技術,以完備的數學理論,計算影像數值追蹤表面特徵,取得特徵的位移、速度、加速度、縮放、變形等資訊,具有高精度、高靈敏的特性。深度學習則以統計、機率領域為基礎,使用數據訓練模型,進而調整參數,以實現任務目標,可歸納複雜問題,泛用性高。藉由結合兩種技術的優勢,開發立體人體姿態檢測與辨識系統、手勢控制機械手臂以及多物件即時追蹤量測。
在立體人體姿態檢測與辨識系統中,首先使用了MediaPipe BlazePose檢測人體姿態特徵點,接著搭配雙相機立體數位影像相關法,成功獲取立體人體姿態特徵精確座標,並與彩色深度相機(RGB-D Camera)量測結果進行詳細討論與比較,呈現雙相機建構的三維座標的穩定性,並收集良好量測的姿態數據進行深度學習模型的訓練,在靜態與動態動作中,有效辨識出人體動作進行分類,能夠應用在許多領域之中。 手勢控制機械手臂利用自行訓練的手勢辨識模型,建立客製化的手勢指令,讓人機操作更為直覺,提高機械手臂控制的效率。最後利用立體數位影像相關法量測機械手臂運行性能,呈現手勢控制機械手臂的效能。 多物件即時追蹤量測使用YOLOv5模型辨識多物件類別,搭配RGB-D相機量測。提出在可視化深度影像中進行影像處理,提取物件形心與重心等資訊,有效提升追蹤的穩定性,使得追蹤不受物件光源、顏色影響量測精度。 | zh_TW |
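The dual-camera stereo reconstruction described in the abstract recovers 3D landmark coordinates from two calibrated views. A minimal sketch of the standard direct linear transform (DLT) triangulation step is shown below, using NumPy only; the camera matrices, baseline, and test point here are illustrative placeholders, not the thesis's calibration.

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Triangulate one 3D point from two calibrated views.

    P1, P2 : (3, 4) camera projection matrices.
    x1, x2 : (u, v) pixel coordinates of the same feature in each view.
    Solves the homogeneous DLT system by SVD and dehomogenizes.
    """
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                  # right singular vector of smallest singular value
    return X[:3] / X[3]         # dehomogenize

# Illustrative setup: two identical cameras, second one offset 0.5 along X.
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])

Xw = np.array([0.2, -0.1, 2.0])              # ground-truth 3D point
x1 = P1 @ np.append(Xw, 1); x1 = x1[:2] / x1[2]
x2 = P2 @ np.append(Xw, 1); x2 = x2[:2] / x2[2]
print(triangulate_dlt(P1, P2, x1, x2))
```

With noise-free projections the DLT solution reproduces the ground-truth point; in practice each BlazePose landmark pair would be triangulated this way after calibration.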
dc.description.abstract | This thesis integrates digital image correlation (DIC), an optical measurement technique, with deep learning for industrial applications. DIC is a non-contact, non-destructive, full-field, and multiscale measurement technique that builds on a complete mathematical theory to track surface features across images, recovering their displacement, velocity, acceleration, scaling, and deformation with high precision and sensitivity. Deep learning, grounded in statistics and probability, trains models on data and adjusts their parameters to accomplish specific tasks; it generalizes well to complex problems. By combining the advantages of these two technologies, three applications are developed: a stereoscopic human posture detection and recognition system, gesture-controlled robotic arms, and real-time multi-object tracking and measurement.
In the stereoscopic human posture detection and recognition system, MediaPipe BlazePose is first used to detect human pose landmarks, which are then combined with stereo digital image correlation using dual cameras to accurately obtain the coordinates of stereoscopic human pose features. Detailed discussions and comparisons with measurements from a color-depth camera (RGB-D Camera) demonstrate the stability of the 3D coordinates constructed using dual cameras. Good-quality pose data is collected for training deep learning models, enabling effective recognition and classification of human movements in both static and dynamic actions. This system has potential applications in various fields. Gesture-controlled robotic arms utilize self-trained gesture recognition models to establish customized gesture commands, making human-machine interaction more intuitive and improving the efficiency of robotic arm control. Finally, the performance of the gesture-controlled robotic arm is measured using stereoscopic digital image correlation, showcasing the effectiveness of gesture control. For real-time multi-object tracking and measurement, the YOLOv5 model is employed to recognize multiple object classes, combined with RGB-D camera measurements. An image processing technique is proposed to extract object centroids and centers of gravity from visualized depth images, effectively enhancing tracking stability and mitigating the impact of object lighting and color on measurement accuracy. | en |
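The depth-image processing step at the end of the abstract (binarizing the visualized depth map and extracting the object centroid) can be sketched in the spirit of the cited Otsu method. This is an illustrative NumPy-only sketch, not the thesis implementation; the synthetic "depth image" and the bright-foreground assumption are made up for the example.

```python
import numpy as np

def otsu_threshold(img):
    """Return the Otsu threshold of a uint8 image (maximizes between-class variance)."""
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    total = hist.sum()
    cum_w = np.cumsum(hist)                    # pixel count of class 0 up to t
    cum_m = np.cumsum(hist * np.arange(256))   # intensity sum of class 0 up to t
    best_t, best_var = 0, -1.0
    for t in range(255):
        w0 = cum_w[t] / total
        w1 = 1.0 - w0
        if w0 == 0 or w1 == 0:
            continue
        m0 = cum_m[t] / cum_w[t]
        m1 = (cum_m[-1] - cum_m[t]) / (cum_w[-1] - cum_w[t])
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def foreground_centroid(img, near_is_bright=True):
    """Binarize with Otsu's threshold and return the (row, col) centroid of the foreground."""
    t = otsu_threshold(img)
    mask = img > t if near_is_bright else img <= t
    rows, cols = np.nonzero(mask)
    return rows.mean(), cols.mean()

# Synthetic visualized depth map: a bright (near) rectangular object on a dark background.
depth_vis = np.full((100, 100), 30, dtype=np.uint8)
depth_vis[40:60, 20:40] = 200
print(foreground_centroid(depth_vis))
```

Because the centroid is computed on the thresholded depth map rather than the color image, the estimate is insensitive to scene lighting and object color, which is the stability property the abstract claims.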
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-09-22T16:25:59Z No. of bitstreams: 0 | en |
dc.description.provenance | Made available in DSpace on 2023-09-22T16:25:59Z (GMT). No. of bitstreams: 0 | en |
dc.description.tableofcontents | 摘要 I
Abstract III
目錄 V
表目錄 VIII
圖目錄 IX
第一章 緒論 1
1.1 研究動機 1
1.2 文獻回顧 3
1.3 內容簡介 4
第二章 數位影像相關法原理以及深度學習模型理論與實驗儀器 7
2.1 數位影像相關法原理介紹 7
2.1.1 數位影像相關法量測流程與特色 7
2.1.2 數位影像相關法相關參數 10
2.1.3 數位影像相關法搜尋演算法 14
2.1.4 二維及三維數位影像相關法原理 27
2.2 深度學習模型介紹 32
2.2.1 深度學習發展脈絡 32
2.2.2 深度學習單元架構 34
2.2.3 卷積神經網絡 38
2.2.4 Transformer模型 39
2.3 實驗儀器介紹 43
2.3.1 數位工業相機 43
2.3.2 數位工業相機鏡頭 43
2.3.3 六軸機械手臂 44
2.3.4 電動夾爪 44
2.3.5 RealSense D435i 44
2.3.6 RealSense D455 45
第三章 立體人體姿態建構與應用 77
3.1 人體姿態檢測系統開發 77
3.1.1 人體姿態檢測模型 78
3.1.2 雙相機立體人體姿態建構 82
3.1.3 RGB-D相機應用於立體人體姿態建構 86
3.1.4 雙相機與RGB-D相機人體姿態實驗比較 91
3.2 人體姿態辨識建構與應用 92
3.2.1 人體姿態資料標籤 93
3.2.2 人體姿態辨識模型建構 95
3.2.3 人體姿態辨識應用於居家監測 95
3.2.4 人體姿態辨識應用於運動分析 97
第四章 手勢辨識建構機械手臂控制系統 127
4.1 手勢辨識系統建構 127
4.1.1 手勢姿態資料建構 128
4.1.2 手勢辨識模型建構 129
4.2 手勢控制機械手臂系統開發 130
4.2.1 手勢指令控制機械手臂 131
4.2.2 數位影像相關法於RGB-D相機追蹤 132
4.2.3 人體手臂座標與機械手臂座標關係建構 133
4.2.4 傳輸控制協定建構 134
4.2.5 手勢控制機械手臂量測 135
第五章 多物件辨識量測 175
5.1 多物件辨識量測系統 175
5.1.1 多物件即時辨識系統 176
5.1.2 多人姿態檢測 179
5.1.3 RGB-D相機應用於物件位置量測 181
第六章 結論與未來展望 195
6.1 結論 195
6.2 未來展望 196
參考文獻 199
表 2-1 CCD與CMOS感光元件特性比較 46
表 2-2 交叉相關係數公式 46
表 2-3 總平方差相關係數公式 47
表 2-4 CPU與GPU特性比較 47
表 2-5 損失函數公式 48
表 2-6 工業相機規格表 48
表 2-7 工業相機鏡頭規格表 49
表 2-8 六軸機械手臂規格表 50
表 2-9 機械手臂控制器規格表 51
表 2-10 電動夾爪規格表 52
表 2-11 RealSense D435i規格表 53
表 2-12 RealSense D455規格表 54
表 3-1 機器學習種類比較 99
表 4-1 控制機械手臂指令表(手勢控制模式) 139
表 4-2 控制機械手臂指令表(追蹤同步模式) 140
表 5-1 YOLOv5各模型表現數據列表[71] 185
圖 2-1 影像色深比較示意圖 55
圖 2-2 常見數位影像相關法特徵:(a) 噴漆斑點特徵 (b) 手寫文字特徵 (c) 網版印刷特徵 (d) QR-code特徵 55
圖 2-3 棋盤格校正板 56
圖 2-4 樣板子集合示意圖 56
圖 2-5 搜尋子集合示意圖 57
圖 2-6 形狀函數(Shape function):剛體平移(Zero Order) 57
圖 2-7 形狀函數(Shape function):剛體旋轉(Zero Order) 58
圖 2-8 形狀函數(Shape function):一階變形(First Order) 58
圖 2-9 形狀函數(Shape function):二階變形(Second Order) 59
圖 2-10 數位影像相關法搜尋示意 59
圖 2-11 (a)相關係數場 (b)CCPF區域坐標系 60
圖 2-12 牛頓拉福森法流程圖 60
圖 2-13 正向疊加牛頓拉福森法流程圖 61
圖 2-14 反向合成高斯牛頓法流程圖 62
圖 2-15 理想針孔成像模型圖 63
圖 2-16 針孔成像模型(a)俯視圖(b)側視圖 63
圖 2-17 對極幾何示意圖 64
圖 2-18 單相機模型投影圖 64
圖 2-19 單相機投影座標轉換 65
圖 2-20 雙相機模型投影圖 65
圖 2-21 雙相機座標轉換關係圖 66
圖 2-22 影像徑向畸變種類 66
圖 2-23 切向畸變成因示意 67
圖 2-24 人工智慧、機器學習、深度學習發展關係圖 67
圖 2-25 類神經網絡模型示意 68
圖 2-26 激活函數:Binary step、tanh、sigmoid、ReLU 68
圖 2-27 卷積運算示意 69
圖 2-28 Transformer模型架構[1] 69
圖 2-29 自注意力機制架構圖(a)Self-Attention (b)Multi-head Self-Attention 70
圖 2-30 工業相機外觀示意 70
圖 2-31 工業相機CAD圖 71
圖 2-32 工業相機鏡頭外觀與CAD圖 71
圖 2-33 六軸機械手臂外觀 72
圖 2-34 機械手臂控制器外觀 72
圖 2-35 六軸機械手臂CAD圖 73
圖 2-36 電動夾爪外觀 74
圖 2-37 電動夾爪CAD圖 74
圖 2-38 RealSense D435i外觀 75
圖 2-39 RealSense D455外觀 75
圖 3-1 MediaPipe人臉辨識注意力機制架構[46] 100
圖 3-2 人體預測特徵點[44] 100
圖 3-3 BlazePose模型架構[44] 101
圖 3-4 手掌特徵點[47] 101
圖 3-5 OpenPose特徵點[42] 102
圖 3-6 OpenPose模型架構[42] 102
圖 3-7 次像素尋角點方法[51] 103
圖 3-8 RealSense過濾器應用效果呈現 103
圖 3-9 人體姿態檢測實驗:跑步 104
圖 3-10 跑步鼻子X空間座標 105
圖 3-11 跑步鼻子Y空間座標 106
圖 3-12 跑步鼻子Z空間座標 106
圖 3-13 跑步鼻子X方向位移比較 106
圖 3-14 跑步鼻子Y方向位移比較 107
圖 3-15 跑步左手腕X方向位移 107
圖 3-16 跑步左手腕Y方向位移 108
圖 3-17 跑步左手腕Z方向位移 109
圖 3-18 跑步3到4秒示意圖 109
圖 3-19 跑步4到6秒示意圖 109
圖 3-20 跑步右腳膝蓋X方向位移 110
圖 3-21 跑步右腳膝蓋Y方向位移 111
圖 3-22 跑步右腳膝蓋Z方向位移 111
圖 3-23 跑步左手臂夾角變化 112
圖 3-24 開合跳鼻子Y空間座標 113
圖 3-25 深蹲鼻子Y方向位移 113
圖 3-26 深蹲右腳夾角變化 114
圖 3-27 伏地挺身左手臂夾角變化 114
圖 3-28 動態人體辨識數據標籤建構方式 115
圖 3-29 靜態人體姿態辨識模型結構 116
圖 3-30 靜態動作損失值變化 116
圖 3-31 t-SNE靜態模型輸出可視化(對應輸出標籤) 117
圖 3-32 t-SNE靜態模型輸出可視化(對應數據標籤) 118
圖 3-33 居家靜態辨識(舉手) 119
圖 3-34 居家靜態辨識(坐姿) 119
圖 3-35 居家靜態辨識(站姿) 120
圖 3-36 居家靜態辨識(躺姿) 120
圖 3-37 動態人體姿態辨識模型結構 121
圖 3-38 動態動作損失值變化 121
圖 3-39 t-SNE動態模型輸出可視化(對應輸出標籤) 122
圖 3-40 t-SNE動態模型輸出可視化(對應數據標籤) 123
圖 3-41 動態運動辨識(跑步) 124
圖 3-42 動態運動辨識(開合跳) 124
圖 3-43 動態運動辨識(深蹲) 125
圖 3-44 動態運動辨識(伏地挺身) 125
圖 4-1 手勢標籤 141
圖 4-2 Dropout演算示意 142
圖 4-3 手勢辨識模型結構 142
圖 4-4 手勢辨識模型訓練損失值變化 143
圖 4-5 t-SNE手勢辨識模型輸出可視化(對應輸出標籤) 144
圖 4-6 t-SNE手勢辨識模型輸出可視化(對應數據標籤) 145
圖 4-7 手勢控制機械手臂系統運作流程圖 146
圖 4-8 模式選擇的手勢 147
圖 4-9 選擇模式一的手勢 147
圖 4-10 選擇模式二的手勢 148
圖 4-11 選擇模式三的手勢 148
圖 4-12 控制機械手臂停止的手勢 149
圖 4-13 控制機械手臂回歸原點的手勢 149
圖 4-14 控制機械手臂進行向右移動的手勢 149
圖 4-15 控制機械手臂進行向左移動的手勢 150
圖 4-16 控制機械手臂進行向上移動的手勢 150
圖 4-17 控制機械手臂進行向下移動的手勢 151
圖 4-18 控制機械手臂進行向前移動的手勢 151
圖 4-19 控制機械手臂進行向後移動的手勢 152
圖 4-20 控制機械手臂第六軸末端效應器順時針旋轉的手勢 152
圖 4-21 控制機械手臂第六軸末端效應器逆時針旋轉的手勢 153
圖 4-22 控制機械手臂夾爪進行夾取的手勢 153
圖 4-23 控制機械手臂鬆開夾爪的手勢 154
圖 4-24 手勢控制機械手臂模式二流程 155
圖 4-25 RGB-D相機即時追蹤手掌深度資訊 156
圖 4-26 定義手掌座標與手臂座標關係的手勢 156
圖 4-27 機械手臂與手掌同步移動的手勢 157
圖 4-28 TCP傳輸控制協定流程 157
圖 4-29 量測實驗架設圖 158
圖 4-30 進行目標夾取任務 158
圖 4-31 模式一X方向座標量測與指令比對 159
圖 4-32 模式一X方向座標量測與指令比對(0秒到10秒) 159
圖 4-33 模式一X方向座標量測與指令比對(25秒到35秒) 160
圖 4-34 模式一Y方向座標量測與指令比對 160
圖 4-35 模式一Y方向座標量測與指令比對(25到32秒) 161
圖 4-36 特徵變形與縮放(26秒到28秒) 161
圖 4-37 模式一Z方向座標量測與指令比對 162
圖 4-38 模式一Z方向座標量測與指令比對(45秒到55秒) 162
圖 4-39 特徵變形與縮放(48秒到50秒) 163
圖 4-40 模式一Y方向座標量測修正與指令比對 163
圖 4-41 模式一Z方向座標量測修正與指令比對 164
圖 4-42 模式二X方向座標量測 164
圖 4-43 模式二X方向座標追蹤指令 165
圖 4-44 模式二Y方向座標量測 165
圖 4-45 模式二Y方向座標追蹤指令 166
圖 4-46 模式二Z方向座標量測 166
圖 4-47 模式二Z方向座標追蹤指令 167
圖 4-48 模式二X方向座標追蹤指令(6到24秒) 167
圖 4-49 模式二X方向座標追蹤指令(26到41秒) 168
圖 4-50 模式二X方向座標追蹤指令(54到65秒) 168
圖 4-51 模式二X方向座標追蹤指令(67到80秒) 169
圖 4-52 模式二Y方向座標追蹤指令(6到24秒) 169
圖 4-53 模式二Y方向座標追蹤指令(26到41秒) 170
圖 4-54 模式二Y方向座標追蹤指令(54到65秒) 170
圖 4-55 模式二Y方向座標追蹤指令(67到80秒) 171
圖 4-56 模式二Z方向座標追蹤指令(6到24秒) 171
圖 4-57 模式二Z方向座標追蹤指令(26到41秒) 172
圖 4-58 模式二Z方向座標追蹤指令(54到65秒) 172
圖 4-59 模式二Z方向座標追蹤指令(67到80秒) 173
圖 5-1 常見物件辨識架構[65] 186
圖 5-2 YOLO模型辨識示意圖[58] 186
圖 5-3 YOLOv3邊界框演算預測[64] 187
圖 5-4 Mish激活函數 187
圖 5-5 YOLOv4模型設計參考[66][67][68] 188
圖 5-6 YOLOv5辨識結果 188
圖 5-7 多人檢測系統實驗分析影像 189
圖 5-8 多人檢測系統實驗深度影像 189
圖 5-9 多人檢測系統實驗X軸座標 190
圖 5-10 多人檢測系統實驗Y軸座標 190
圖 5-11 多人檢測系統實驗Z軸座標 191
圖 5-12 二值化分析結果 191
圖 5-13 物件追蹤重心結果 192
圖 5-14 追蹤物件實驗呈現 193
圖 5-15 追蹤時鐘物件實驗X方向座標 193
圖 5-16 追蹤時鐘物件實驗Y方向座標 194
圖 5-17 追蹤時鐘物件實驗Z方向座標 194 | - |
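Section 2.1.3 of the table of contents covers the DIC search algorithms, and Table 2-2 lists the cross-correlation criterion. As a minimal, hedged illustration of that core idea (not the thesis code; the synthetic speckle image, subset size, and brute-force search are made up for the example), the following NumPy sketch tracks a reference subset in a deformed image by integer-pixel zero-normalized cross-correlation (ZNCC), with no shape function or sub-pixel refinement:

```python
import numpy as np

def zncc(a, b):
    """Zero-normalized cross-correlation between two equal-size subsets."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / denom if denom > 0 else 0.0

def track_subset(ref_img, def_img, top_left, size):
    """Integer-pixel DIC search: slide the reference subset over the deformed
    image and return the displacement (du, dv) maximizing the ZNCC score."""
    r0, c0 = top_left
    subset = ref_img[r0:r0 + size, c0:c0 + size]
    best_score, best_disp = -2.0, (0, 0)
    H, W = def_img.shape
    for r in range(H - size + 1):
        for c in range(W - size + 1):
            score = zncc(subset, def_img[r:r + size, c:c + size])
            if score > best_score:
                best_score, best_disp = score, (r - r0, c - c0)
    return best_disp

# Synthetic speckle pattern, rigidly shifted by (3, 5) pixels.
rng = np.random.default_rng(0)
ref = rng.random((40, 40))
deformed = np.roll(np.roll(ref, 3, axis=0), 5, axis=1)
print(track_subset(ref, deformed, top_left=(10, 10), size=11))
```

Real DIC implementations replace this exhaustive scan with the Newton-Raphson or inverse compositional Gauss-Newton iterations listed in the table of contents, which add sub-pixel accuracy and deformation (shape-function) parameters.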
dc.language.iso | zh_TW | - |
dc.title | 立體數位影像相關法與深度學習系統整合應用於三維量測、姿態辨識、手臂控制 | zh_TW |
dc.title | Integration of Stereo Digital Image Correlation and Deep Learning Systems for Applications in Three-Dimensional Measurement, Pose Recognition, and Robotic Arm Control | en |
dc.type | Thesis | - |
dc.date.schoolyear | 111-2 | - |
dc.description.degree | 碩士 | - |
dc.contributor.oralexamcommittee | 林柏廷;楊朝龍 | zh_TW |
dc.contributor.oralexamcommittee | Po-Ting Lin;Chao-Lung Yang | en |
dc.subject.keyword | 立體數位影像相關法,深度學習於電腦視覺,人體姿態檢測,手勢辨識,即時多物件辨識追蹤 | zh_TW |
dc.subject.keyword | Stereo digital image correlation, Deep learning in computer vision, Human pose detection, Gesture recognition, Real-time multi-object tracking and recognition | en |
dc.relation.page | 207 | - |
dc.identifier.doi | 10.6342/NTU202302248 | - |
dc.rights.note | Authorized for release (restricted to campus access) | - |
dc.date.accepted | 2023-08-12 | - |
dc.contributor.author-college | 工學院 | - |
dc.contributor.author-dept | 機械工程學系 | - |
Appears in Collections: | 機械工程學系
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-111-2.pdf (access restricted to NTU campus IPs; use the VPN service for off-campus access) | 23.95 MB | Adobe PDF | View/Open |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.