Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/86667

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 葉仲基 (Chung-Kee Yeh) | |
| dc.contributor.author | Shao-Yun Wu | en |
| dc.contributor.author | 吳少云 | zh_TW |
| dc.date.accessioned | 2023-03-20T00:10:01Z | - |
| dc.date.copyright | 2022-08-10 | |
| dc.date.issued | 2022 | |
| dc.date.submitted | 2022-08-03 | |
| dc.identifier.citation | 歐昱誠. 2020. DDPG-based indoor path planning and obstacle avoidance for an autonomous vehicle. Master's thesis. Department of Electrical Engineering, National Chung Hsing University.<br>劉冠伯. 2020. Real-time localization, mapping, and navigation technology applied to smart vehicles in facilities. Master's thesis. Department of Biomechatronics Engineering, National Taiwan University.<br>戴君彥. 2020. Construction of a map with environment information inside facilities. Master's thesis. Department of Biomechatronics Engineering, National Taiwan University.<br>IT人. 2020. In-depth analysis of the object detection algorithm YOLOv4. Available at: https://iter01.com/568038.html. Accessed 2021-08-20.<br>Bochkovskiy, A., C. Y. Wang, and H. Y. M. Liao. 2020. YOLOv4: Optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934.<br>Doucet, A., N. de Freitas, K. Murphy, and S. Russell. 2013. Rao-Blackwellised particle filtering for dynamic Bayesian networks. arXiv preprint arXiv:1301.3853.<br>Babaeizadeh, M., I. Frosio, S. Tyree, J. Clemons, and J. Kautz. 2016. Reinforcement learning through asynchronous advantage actor-critic on a GPU. arXiv preprint arXiv:1611.06256.<br>CampusAI. 2021. Distributed RL. Available at: https://campusai.github.io/lectures/lecture17. Accessed 2021-09-25.<br>Fox, D., W. Burgard, and S. Thrun. 1997. The dynamic window approach to collision avoidance. IEEE Robotics & Automation Magazine 4(1): 23-33.<br>Grisetti, G., C. Stachniss, and W. Burgard. 2007. Improved techniques for grid mapping with Rao-Blackwellized particle filters. IEEE Transactions on Robotics 23(1): 34-46.<br>Hart, P. E., N. J. Nilsson, and B. Raphael. 1968. A formal basis for the heuristic determination of minimum cost paths. IEEE Transactions on Systems Science and Cybernetics 4(2): 100-107.<br>Kang, C., C. Rong, W. Ren, F. Huo, and P. Liu. 2021. Deep deterministic policy gradient based on double network prioritized experience replay. IEEE Access 9: 60296-60308.<br>Konda, V. R., and J. N. Tsitsiklis. 2000. Actor-critic algorithms. Advances in Neural Information Processing Systems: 1008-1014.<br>Lillicrap, T. P., J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra. 2015. Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971.<br>Mnih, V., K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. Riedmiller. 2013. Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602.<br>Murphy, K., and S. Russell. 2001. Rao-Blackwellised particle filtering for dynamic Bayesian networks. In Sequential Monte Carlo Methods in Practice: 499-515.<br>Niu, H., Z. Ji, F. Arvin, B. Lennox, H. Yin, and J. Carrasco. 2021. Accelerated sim-to-real deep reinforcement learning: Learning collision avoidance from human player. IEEE/SICE International Symposium on System Integration (SII): 144-149.<br>ProgrammerSought. 2021. Structure diagram of YOLOv3 and V4. Available at: https://www.programmersought.com/article/27904460756/. Accessed 2021-08-29.<br>ROBOTIS. 2021. ROBOTIS e-Manual. Available at: https://emanual.robotis.com/docs/en/platform/turtlebot3/appendix_lds_01/. Accessed 2021-09-18.<br>Silver, D., G. Lever, N. Heess, T. Degris, D. Wierstra, and M. Riedmiller. 2014. Deterministic policy gradient algorithms. Proceedings of Machine Learning Research 32(1): 387-395.<br>Stentz, A. 1997. Optimal and efficient path planning for partially known environments. In Intelligent Unmanned Ground Vehicles: 203-220. Springer, Boston, MA.<br>Sutton, R. S., D. A. McAllester, S. P. Singh, and Y. Mansour. 2000. Policy gradient methods for reinforcement learning with function approximation. Advances in Neural Information Processing Systems: 1057-1063.<br>Towards Data Science. 2018. The unscented Kalman filter: Anything EKF can do I can do it better. Available at: https://towardsdatascience.com/the-unscented-kalman-filteranything-ekf-can-do-i-can-do-it-better-ce7c773cf88d. Accessed 2021-09-24. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/86667 | - |
| dc.description.abstract | Indoor navigation and object recognition are the two key technologies for using an unmanned vehicle to observe fruit information automatically in a greenhouse. In vehicle navigation, even though many algorithms can compute a global path successfully, they lack real-time local behavior planning and therefore lose the ability to navigate when the distribution of obstacles changes. In fruit recognition, results often fall short of expectations when fruits are slightly occluded or too close to each other. This study implements Deep Deterministic Policy Gradient (DDPG) reinforcement learning on a crawler vehicle combined with lidar sensing so that it can handle situations where the vehicle is too close to an obstacle. Under A* global path planning, DDPG assists the Dynamic Window Approach (DWA) local behavior planning algorithm, enabling the unmanned vehicle to adapt to dynamic environments. Once the vehicle reaches the target coordinates, the YOLOv4 deep learning model recognizes fruits from real-time images captured by the onboard camera, and clustering and coordinate transformation techniques present the fruit information on a real-time 3D map. Experimental results show that with DDPG intervening at distances of 0.35 to 0.36 m, the vehicle can adjust its behavioral decisions in real time, avoiding the navigation interruptions and in-place spinning that the DWA algorithm exhibits in extreme situations, and reducing travel time by about 20%. On the greenhouse information map, after processing by YOLOv4 and the clustering algorithm, individual fruit coordinates are presented with a 0.2 m error and maturity status with 85% average accuracy. This study brings more efficient management and a more robust system to smart agriculture applications. | zh_TW |
| dc.description.abstract | Indoor navigation and object recognition are two key technologies for an unmanned vehicle observing fruit information in a greenhouse. In navigation, even though many algorithms can plan a global path successfully, they usually fail when the distribution of obstacles changes. In object recognition, results often fall short of expectations when fruits are slightly occluded or too close to each other. This research implements DDPG (Deep Deterministic Policy Gradient), a reinforcement learning method, on a crawler unmanned vehicle with a lidar sensor in order to handle situations where the vehicle is too close to obstacles. Under A* global path planning, DDPG assists the DWA (Dynamic Window Approach) local behavior planning algorithm so that the unmanned vehicle can adapt to dynamic environments. Once the vehicle reaches the target coordinates, the deep learning model YOLOv4 identifies fruits in real-time images obtained by the webcam; the detections are then clustered, transformed into map coordinates, and presented on a real-time 3D information map. Experimental results show that with DDPG intervening at distances of 0.35 to 0.36 m, the vehicle adjusts its behavioral policy in real time, saving about 20% of travel time and avoiding the navigation interruptions and in-place spinning that occur when only the DWA algorithm is used. In addition, individual fruits are displayed on the greenhouse information map with a 0.2 m coordinate error and maturity status at 85% average accuracy by YOLOv4 and the clustering algorithm. This work enhances smart agriculture applications with more efficient management and a more robust system. (Illustrative sketches of the DDPG/DWA switching rule and the detection clustering step follow this record.) | en |
| dc.description.provenance | Made available in DSpace on 2023-03-20T00:10:01Z (GMT). No. of bitstreams: 1 U0001-0208202222205100.pdf: 4310187 bytes, checksum: 8233784a3f282ca7faf2630141fbfe94 (MD5) Previous issue date: 2022 | en |
| dc.description.tableofcontents | Acknowledgements II<br>Abstract (Chinese) III<br>Abstract (English) IV<br>Table of Contents V<br>List of Figures VIII<br>List of Tables X<br>Chapter 1 Introduction 1<br>1.1 Foreword 1<br>1.2 Research Objectives 1<br>Chapter 2 Literature Review 3<br>2.1 Simultaneous Localization and Mapping 3<br>2.1.1 Rao-Blackwellised Particle Filter 3<br>2.1.2 Gmapping 4<br>2.1.3 Adaptive Monte Carlo Localization 5<br>2.2 Navigation and Path Planning 5<br>2.2.1 A* Algorithm 6<br>2.2.2 Dynamic Window Approach 6<br>2.3 Reinforcement Learning 7<br>2.3.1 Basic Concepts of Reinforcement Learning 7<br>2.3.2 DQN (Deep Q-Learning Network) 9<br>2.3.3 Actor-Critic 10<br>2.3.4 DDPG (Deep Deterministic Policy Gradient) 11<br>2.4 Image Recognition 14<br>2.4.1 YOLOv4 14<br>Chapter 3 Materials and Methods 17<br>3.1 System Architecture 17<br>3.1.1 Software Architecture 18<br>3.1.2 Hardware Architecture 20<br>3.2 Experimental Environment 23<br>3.2.1 Simulation Environment 24<br>3.3 DDPG Unmanned Vehicle Navigation 26<br>3.3.1 Reward Design 26<br>3.3.2 State and Action Definitions 27<br>3.3.3 Training Procedure 27<br>3.3.4 Real-World System Architecture 29<br>3.4 Object Recognition 31<br>3.4.1 YOLOv4 31<br>3.4.2 Coordinate Transformation 33<br>3.4.3 Bayesian Gaussian Mixture Model 35<br>Chapter 4 Results and Discussion 36<br>4.1 DDPG Training Results 36<br>4.1.1 DDPG in a Static Simulation Environment 37<br>4.1.2 DDPG in a Simulation Environment with a Single Dynamic Obstacle 39<br>4.1.3 DDPG in a Simulation Environment with Multiple Dynamic Obstacles 41<br>4.2 Navigation Performance Analysis of DDPG Combined with DWA 43<br>4.2.1 Path Planning Step Count Test in a Static Simulation Environment 43<br>4.2.2 Path Planning Step Count Test in a Dynamic Simulation Environment 46<br>4.3 YOLOv4 Training and Testing 49<br>4.4 Object Image Recognition and Labeling 52<br>4.4.1 Clustering Results in the Simulation Environment 52<br>4.4.2 Error Testing and Analysis in the Simulation Environment 53<br>4.5 Field Test Results 55<br>4.5.1 Navigation Test of the Real Vehicle in a Dynamic Environment 55<br>4.5.2 Greenhouse Information Map Construction in the Real Field 56<br>Chapter 5 Conclusions and Suggestions 59<br>5.1 Conclusions 59<br>5.2 Suggestions 60<br>References 61 | |
| dc.language.iso | zh-TW | |
| dc.subject | Navigation | zh_TW |
| dc.subject | Deep Deterministic Policy Gradient | zh_TW |
| dc.subject | Object recognition | zh_TW |
| dc.subject | Environment information map | zh_TW |
| dc.subject | DDPG | en |
| dc.subject | Navigation | en |
| dc.subject | Object recognition | en |
| dc.subject | Environment information map | en |
| dc.title | Deep Reinforcement Learning Applied to Greenhouse Unmanned Vehicle Navigation and Object Image Recognition | zh_TW |
| dc.title | Deep Reinforcement Learning for Unmanned Vehicle Navigation and Image Recognition in Greenhouse | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 110-2 | |
| dc.description.degree | Master's | |
| dc.contributor.oralexamcommittee | 黃振康 (Chen-Kang Huang), 吳剛智 (Gang-Jhy Wu) | |
| dc.subject.keyword | Navigation, Deep Deterministic Policy Gradient, Object recognition, Environment information map | zh_TW |
| dc.subject.keyword | Navigation, DDPG, Object recognition, Environment information map | en |
| dc.relation.page | 64 | |
| dc.identifier.doi | 10.6342/NTU202201990 | |
| dc.rights.note | Authorized (open access worldwide) | |
| dc.date.accepted | 2022-08-03 | |
| dc.contributor.author-college | College of Bioresources and Agriculture | zh_TW |
| dc.contributor.author-dept | Department of Biomechatronics Engineering | zh_TW |
| dc.date.embargo-lift | 2022-08-10 | - |
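The abstract describes two technical steps that are easy to misread: the rule for when the learned DDPG policy takes over from DWA, and the clustering that turns repeated YOLOv4 detections into individual fruit coordinates. The two Python sketches below illustrate them. They are not the thesis code: the `dwa_planner`/`ddpg_policy` objects, variable names, and detection coordinates are all assumptions; only the 0.35 m intervention distance and the use of a Bayesian Gaussian mixture model (section 3.4.3 of the table of contents) come from the record itself.

```python
# Minimal sketch of the DDPG/DWA switching rule, assuming hypothetical
# dwa_planner / ddpg_policy objects. Not the thesis implementation.
import numpy as np

INTERVENTION_DIST = 0.35  # m; the abstract reports 0.35-0.36 m working best


class HybridLocalPlanner:
    def __init__(self, dwa_planner, ddpg_policy):
        self.dwa = dwa_planner    # classical local planner (hypothetical object)
        self.ddpg = ddpg_policy   # trained DDPG actor (hypothetical object)

    def step(self, lidar_ranges, goal, odom):
        """Return (linear_vel, angular_vel) for the next control cycle."""
        if np.min(lidar_ranges) < INTERVENTION_DIST:
            # Too close to an obstacle: hand control to the learned policy,
            # the regime where the abstract says plain DWA stalls or spins.
            state = np.concatenate([lidar_ranges, goal, odom])
            return self.ddpg.act(state)
        # Otherwise follow DWA along the A* global path as usual.
        return self.dwa.compute_velocity(lidar_ranges, goal, odom)
```

The clustering step can be sketched with scikit-learn's real `BayesianGaussianMixture` API, again with invented detection coordinates:

```python
# Minimal sketch of grouping repeated YOLOv4 detections (already projected
# into map coordinates) into individual fruit positions. The points are
# made up; the API calls are real scikit-learn.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

# Each row: (x, y, z) of one detection in the map frame.
detections = np.array([
    [1.02, 0.48, 0.92], [1.05, 0.51, 0.95], [1.03, 0.49, 0.93],  # fruit A
    [2.31, 0.47, 1.10], [2.28, 0.45, 1.12], [2.30, 0.46, 1.11],  # fruit B
    [3.10, 0.52, 0.88], [3.12, 0.50, 0.90], [3.11, 0.53, 0.89],  # fruit C
])

# A Dirichlet-process prior suppresses unneeded components, so n_components
# only has to be an upper bound on the number of fruits in view.
bgm = BayesianGaussianMixture(
    n_components=5,
    weight_concentration_prior_type="dirichlet_process",
    random_state=0,
).fit(detections)

labels = bgm.predict(detections)
for k in np.unique(labels):
    print(f"fruit {k}: centroid {detections[labels == k].mean(axis=0)}")
```

Each cluster centroid is then what would be drawn on the 3D greenhouse information map.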
Appears in Collections: Department of Biomechatronics Engineering
Files in This Item:
| File | Size | Format |
|---|---|---|
| U0001-0208202222205100.pdf | 4.21 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.