Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/8352
Full metadata record
DC Field: Value [Language]
dc.contributor.advisor: 李志中 (Jyh-Jone Lee)
dc.contributor.author: Chia-Lien Lien
dc.contributor.author: 李佳蓮 [zh_TW]
dc.date.accessioned: 2021-05-20T00:52:35Z
dc.date.available: 2025-08-14
dc.date.available: 2021-05-20T00:52:35Z
dc.date.copyright: 2020-09-22
dc.date.issued: 2020
dc.date.submitted: 2020-08-15
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/8352
dc.description.abstract: This study proposes a modular classification-and-grasping pipeline for stacked objects. An RGB-D camera captures color and depth images of the object pile; an instance segmentation model (Mask R-CNN) and a Generative Grasping Convolutional Neural Network (GG-CNN) then identify multiple grasp points within the pile. Finally, the grasp points of all objects are merged back into the scene, depth information is used to select grasps that do not interfere with neighboring objects, and the robot arm is commanded to execute the grasp.
In the first step, this study uses Mask R-CNN to perform instance segmentation on the clutter image, separating the objects from the pile one by one to obtain each object's location and class, and adds an edge loss to obtain more precise mask contours.
In the second step, GG-CNN generates a pixel-wise grasp-quality score from the depth information of a single object. Because this model can still predict grasps for unseen objects, its parameters do not need to be retrained when new target objects are added.
In the third step, the depth image is combined with the object locations from the first step and the quality scores from the second step to eliminate grasp points that may cause collisions; the remaining grasps, ranked by quality, are the final output of the pipeline. Finally, a robot-arm system is used to verify the feasibility of the pipeline, achieving a grasp success rate of 84.3%. [zh_TW]
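To make the second step more concrete, the following is a minimal sketch of how pixel-wise GG-CNN-style outputs (quality, angle, and width maps) could be sampled into grasp candidates for one segmented object. The function name, array layout, and top-k sampling are illustrative assumptions, not the thesis's actual implementation.

    import numpy as np

    def sample_grasp_candidates(quality, angle, width, mask, k=5):
        """Return up to k grasp candidates inside one object's instance mask.

        quality, angle, width: HxW maps from a GG-CNN-style network
            (grasp quality score, gripper rotation, gripper opening width).
        mask: HxW boolean instance mask for the object (e.g., from Mask R-CNN).
        Each candidate is (row, col, angle, width, quality), best quality first.
        """
        # Suppress pixels outside the object so neighbouring objects cannot
        # contribute candidates for this instance.
        masked_q = np.where(mask, quality, -np.inf)

        # Flat indices of the k best pixels, sorted by descending quality.
        order = np.argsort(masked_q, axis=None)[::-1][:k]
        rows, cols = np.unravel_index(order, masked_q.shape)

        return [
            (int(r), int(c), float(angle[r, c]), float(width[r, c]), float(masked_q[r, c]))
            for r, c in zip(rows, cols)
            if np.isfinite(masked_q[r, c])  # skip masked-out pixels if k exceeds the mask size
        ]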
dc.description.abstract: This thesis presents a robotic grasping and classification system for objects in cluttered environments. The system consists of three main parts: (i) instance segmentation, (ii) grasp candidate generation, and (iii) collision avoidance.
In the first part, the instance segmentation model, Mask R-CNN, isolates each object in the clutter from the scene and is improved to obtain accurate mask edges.
In the second part, a Generative Grasping Convolutional Neural Network (GG-CNN) predicts grasp quality and grasp poses for every object segmented in the first part; grasp candidates are then sampled from the pixel-wise predictions of GG-CNN.
In the last part, the algorithm selects collision-free grasps from the candidates based on depth information. Finally, a robotic system is presented to demonstrate the effectiveness of the pipeline. It is shown that a grasp success rate of 84.3% can be achieved. [en]
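As a rough illustration of the collision-avoidance part, the sketch below keeps only those merged grasp candidates whose gripper footprint contains no neighboring surface rising above the grasp point, then ranks the survivors by quality. The function name, the square footprint, and the clearance threshold are assumptions for illustration; the thesis's actual interference check may differ.

    import numpy as np

    def select_collision_free_grasps(candidates, depth, half_width_px=20, clearance_m=0.01):
        """Filter grasp candidates with a simple depth-based interference check.

        candidates: list of (row, col, quality) grasp points merged from all objects.
        depth:      HxW depth image in metres (smaller value = closer to the camera).
        A candidate is kept only if nothing inside a square gripper footprint
        around it sticks up more than `clearance_m` above the grasp point.
        Returns the surviving grasps sorted by quality, best first.
        """
        h, w = depth.shape
        keep = []
        for r, c, q in candidates:
            r0, r1 = max(0, r - half_width_px), min(h, r + half_width_px + 1)
            c0, c1 = max(0, c - half_width_px), min(w, c + half_width_px + 1)
            neighbourhood = depth[r0:r1, c0:c1]
            # If the closest surface in the footprint is not much closer to the
            # camera than the grasp point itself, the gripper can descend
            # without hitting a neighbouring object.
            if neighbourhood.min() >= depth[r, c] - clearance_m:
                keep.append((r, c, q))
        return sorted(keep, key=lambda g: g[2], reverse=True)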
dc.description.provenance: Made available in DSpace on 2021-05-20T00:52:35Z (GMT). No. of bitstreams: 1
U0001-3007202021210900.pdf: 4351030 bytes, checksum: 196617dda2657fa7aff0c74e21371c44 (MD5)
Previous issue date: 2020 [en]
dc.description.tableofcontents: Verification Letter from the Oral Examination Committee i
Acknowledgements ii
Abstract (Chinese) iii
ABSTRACT iv
Table of Contents v
List of Figures viii
List of Tables xi
Chapter 1 Introduction 1
1-1 Background 1
1-2 Literature Review 2
1-2-1 Grasp Prediction for Single Objects 2
1-2-2 Grasp Prediction for Cluttered Objects 5
1-2-3 Grasping in Virtual Environments 8
1-3 Research Objectives 11
1-4 Thesis Organization 12
Chapter 2 Object Segmentation 13
2-1 Scene Understanding 13
2-1-1 Image Classification 14
2-1-2 Object Localization 14
2-1-3 Semantic Segmentation 15
2-1-4 Instance Segmentation 16
2-1-5 Scene Understanding for Cluttered Objects 17
2-2 Mask R-CNN 18
2-2-1 Feature Extraction 18
2-2-2 Region Proposal Network 21
2-2-3 RoIAlign 21
2-2-4 Mask Prediction Branch 22
2-2-5 Bounding Box Prediction Branch 23
2-3 Edge Agreement Loss 24
2-4 Training Data 26
2-4-1 Data Collection 26
2-4-2 Annotation Tool 26
2-4-3 Data Annotation 27
2-5 Model Training 28
2-5-1 Loss Function 28
2-5-2 Pre-trained Weights 30
2-5-3 Data Augmentation 30
2-5-4 Hyperparameter Tuning 31
2-6 Model Prediction Results 33
2-6-1 Feature Extraction 33
2-6-2 Region Proposal Network Predictions 34
2-6-3 Mask Predictions 35
2-6-4 Instance Segmentation Results 36
Chapter 3 Grasp Generation 37
3-1 Generative Grasping Convolutional Neural Network 37
3-1-1 Grasp Definition 37
3-1-2 Model Architecture 39
3-2 Model Training 40
3-2-1 Training Data 40
3-2-2 Loss Function 42
3-3 Model Prediction Results 44
3-3-1 Cornell Grasping Dataset 44
3-3-2 Segmented Objects 45
3-4 Grasp Candidates 47
3-5 Augmenting Grasp Candidates 48
3-6 Grasp Interference Check 49
3-7 Final Grasp Selection 52
Chapter 4 System and Validation 54
4-1 System Description 54
4-1-1 System Architecture 54
4-1-2 Experimental Setup 55
4-2 Grasping Pipeline Validation 57
4-2-1 Experimental Procedure 57
4-2-2 Grasping Results and Success Rate 58
4-2-3 Computation Time of the Grasping Pipeline 58
4-3 Edge Agreement Loss Validation 59
Chapter 5 Conclusions and Future Work 60
5-1 Conclusions 60
5-2 Future Work 60
References 62
dc.language.iso: zh-TW
dc.title: 以實例切割與夾取點生成卷積類神經網路應用於隨機堆疊物件之分類夾取 [zh_TW]
dc.title: Robotic Random Bin Picking and Classification System using Instance Segmentation and Generative Grasping Convolutional Neural Network [en]
dc.type: Thesis
dc.date.schoolyear: 108-2
dc.description.degree: Master
dc.contributor.oralexamcommittee: 陳亮嘉 (Liang-Chia Chen), 林沛群 (Pei-Chun Lin)
dc.subject.keyword: 機械手臂, 堆疊夾取, 實例切割, 深度學習 [zh_TW]
dc.subject.keyword: Robotic Arm, Clutter Grasping, Instance Segmentation, Deep Learning [en]
dc.relation.page: 64
dc.identifier.doi: 10.6342/NTU202002128
dc.rights.note: Authorization granted (open access worldwide)
dc.date.accepted: 2020-08-17
dc.contributor.author-college: 工學院 (College of Engineering) [zh_TW]
dc.contributor.author-dept: 機械工程學研究所 (Graduate Institute of Mechanical Engineering) [zh_TW]
dc.date.embargo-lift: 2025-08-14
Appears in collections: 機械工程學系 (Department of Mechanical Engineering)

Files in this item:
File | Size | Format
U0001-3007202021210900.pdf | 4.25 MB | Adobe PDF