NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88536
Full metadata record (DC field: value [language])
dc.contributor.advisor: 李志中 [zh_TW]
dc.contributor.advisor: Jyh-Jone Lee [en]
dc.contributor.author: 周昱辰 [zh_TW]
dc.contributor.author: Yu-Chen Chou [en]
dc.date.accessioned: 2023-08-15T16:44:14Z [-]
dc.date.available: 2023-11-09 [-]
dc.date.copyright: 2023-08-15 [-]
dc.date.issued: 2023 [-]
dc.date.submitted: 2023-07-26 [-]
dc.identifier.citation:
Zeng, A., Song, S., Yu, K. T., Donlon, E., Hogan, F. R., Bauza, M., ... & Rodriguez, A., “Robotic Pick-and-Place of Novel Objects in Clutter with Multi-Affordance Grasping and Cross-Domain Image Matching,” The International Journal of Robotics Research, 41(7), pp. 690-705, 2022.
Morrison, D., Corke, P., & Leitner, J., “Learning Robust, Real-Time, Reactive Robotic Grasping,” The International Journal of Robotics Research, 39(2-3), pp. 183-201, 2020.
“Cornell Grasping Dataset,” available from http://pr.cs.cornell.edu/grasping/rect_data/data.php, accessed June 2023.
Shao, Q., Hu, J., Wang, W., Fang, Y., Liu, W., Qi, J., & Ma, J., “Suction Grasp Region Prediction Using Self-Supervised Learning for Object Picking in Dense Clutter,” IEEE 5th International Conference on Mechatronics System and Robots (ICMSR), Singapore, pp. 7-12, 2019.
He, K., Zhang, X., Ren, S., & Sun, J., “Deep Residual Learning for Image Recognition,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, pp. 770-778, 2016.
Tachikake, H., & Watanabe, W., “A Learning-Based Robotic Bin-Picking with Flexibly Customizable Grasping Conditions,” IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, USA, pp. 9040-9047, 2020.
Zhu, J. Y., Park, T., Isola, P., & Efros, A. A., “Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks,” Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, pp. 2223-2232, 2017.
Jiang, P., Oaki, J., Ishihara, Y., Ooga, J., Han, H., Sugahara, A., & Ogawa, A., “Learning Suction Graspability Considering Grasp Quality and Robot Reachability for Bin-Picking,” Frontiers in Neurorobotics 16, 2022.
Wang, H., Situ, H., & Zhuang, C., “6D Pose Estimation for Bin-Picking Based on Improved Mask R-CNN and DenseFusion,” 26th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), Vasteras, Sweden, pp. 1-7, 2021.
Blender Foundation, “Blender,” available from https://www.blender.org/.
Xie, S., Girshick, R., Dollár, P., Tu, Z., & He, K., “Aggregated Residual Transformations for Deep Neural Networks,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, pp. 1492-1500, 2017.
He, K., Gkioxari, G., Dollár, P., & Girshick, R., “Mask R-CNN,” Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, pp. 2961-2969, 2017.
Wang, C., Xu, D., Zhu, Y., Martín-Martín, R., Lu, C., Fei-Fei, L., & Savarese, S., “DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion,” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual Conference, pp. 3343-3352, 2019.
Wang, J. W., Li, C. L., Chen, J. L., & Lee, J. J., “Robot Grasping in Dense Clutter Via View-Based Experience Transfer,” International Journal of Intelligent Robotics and Applications, 6(1), pp. 23-37, 2022.
Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., Manzagol, P. A., & Bottou, L., “Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion,” Journal of Machine Learning Research, 11(12), 2010.
李佳蓮, “以實例切割與夾取點生成卷積類神經網路應用於隨機堆疊物件之分類夾取,” Master's thesis, Graduate Institute of Mechanical Engineering, National Taiwan University, Taipei, Taiwan, 2020.
Wang, C. H., & Lin, P. C., “Q-Pointnet: Intelligent Stacked-Objects Grasping Using a RGBD Sensor and a Dexterous Hand,” International Conference on Advanced Intelligent Mechatronics (AIM), Virtual Conference, pp. 601-606, 2020.
Qi, C. R., Su, H., Mo, K., & Guibas, L. J., “Pointnet: Deep Learning on Point Sets for 3D Classification and Segmentation,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, pp. 652-660, 2017.
Lee, S., & Lee, Y., “Real-Time Industrial Bin-Picking with a Hybrid Deep Learning-Engineering Approach,” IEEE International Conference on Big Data and Smart Computing (BigComp), Busan, Korea, pp. 584-588, 2020.
Redmon, J., & Farhadi, A., “Yolov3: An Incremental Improvement,” Computer Vision and Pattern Recognition, Salt Lake City, USA, pp. 1-6, 2018.
Sundermeyer, M., Marton, Z. C., Durner, M., Brucker, M., & Triebel, R., “Implicit 3D Orientation Learning for 6D Object Detection from RGB Images,” Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, pp. 699-715, 2018.
Zhao, W., Queralta, J. P., & Westerlund, T., “Sim-to-Real Transfer in Deep Reinforcement Learning for Robotics: a Survey,” Symposium series on Computational Intelligence (SSCI), Canberra, Australia, pp. 737-744, 2020.
Bousmalis, K., Irpan, A., Wohlhart, P., Bai, Y., Kelcey, M., Kalakrishnan, M., ... & Vanhoucke, V., “Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping,” IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia, pp. 4243-4250, 2018.
Pang, Y., Lin, J., Qin, T., & Chen, Z., “Image-to-Image Translation: Methods and Applications,” IEEE Transactions on Multimedia, 24, pp. 3859-3881, 2021.
Park, T., Efros, A. A., Zhang, R., & Zhu, J. Y., “Contrastive Learning for Unpaired Image-to-Image Translation,” Computer Vision–ECCV: 16th European Conference, August 23–28, Proceedings, Part IX 16, Glasgow, UK, pp. 319-345, 2020.
Tobin, J., Fong, R., Ray, A., Schneider, J., Zaremba, W., & Abbeel, P., “Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World,” IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, Canada, pp. 23-30, 2017.
Van der Maaten, L., & Hinton, G., “Visualizing Data Using t-SNE,” Journal of Machine Learning Research, 9(11), 2008.
Soloveitchik, M., Diskin, T., Morin, E., & Wiesel, A., “Conditional Frechet Inception Distance,” arXiv preprint arXiv:2103.11521, 2021.
Lin, T. Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., ... & Zitnick, C. L., “Microsoft Coco: Common Objects in Context,” Computer Vision–ECCV 2014: 13th European Conference, September 6-12, 2014, Proceedings, Part V 13, Zurich, Switzerland, pp. 740-755, 2014.
LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P., “Gradient-Based Learning Applied to Document Recognition,” Proceedings of the IEEE, 86(11), pp. 2278-2324, 1998.
Ronneberger, O., Fischer, P., & Brox, T., “U-net: Convolutional Networks for Biomedical Image Segmentation,” Medical Image Computing and Computer-Assisted Intervention–MICCAI: 18th International Conference, October 5-9, 2015, Proceedings, Part III 18, Munich, Germany, pp. 234-241, 2015.
Badrinarayanan, V., Kendall, A., & Cipolla, R., “Segnet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(12), pp. 2481-2495, 2017.
Girshick, R., Donahue, J., Darrell, T., & Malik, J., “Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, USA, pp. 580-587, 2014.
Girshick, R., “Fast R-CNN,” Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, pp. 1440-1448, 2015.
Chen, L. C., Papandreou, G., Kokkinos, I., Murphy, K., & Yuille, A. L., “Deeplab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected Crfs,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4), pp. 834-848, 2017.
王俞程, “與類別無關之實例切割應用於未知堆疊物件之夾取,” Master's thesis, Graduate Institute of Mechanical Engineering, National Taiwan University, Taipei, Taiwan, 2022.
Bengio, Y., Louradour, J., Collobert, R., & Weston, J., “Curriculum Learning,” Proceedings of the 26th Annual International Conference on Machine Learning, Montreal, Canada, pp. 41-48, 2009.
Depierre, A., Dellandréa, E., & Chen, L., “Jacquard: A Large Scale Dataset for Robotic Grasp Detection,” IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, pp. 3511-3516, 2018.
[-]
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88536 [-]
dc.description.abstract:
In robotic grasping tasks, using a deep learning network for object recognition usually requires photographing the actual task scene and then manually annotating the collected images. This process consumes a great deal of time and labor, which raises the upfront cost of deploying a grasping application and makes it difficult to switch production lines flexibly. Therefore, this study first uses rendering/simulation software together with CAD models of the target objects to synthesize a large virtual training dataset, with the annotations generated automatically at the same time (a minimal rendering sketch follows this abstract). Domain randomization and a style transfer model from domain adaptation are then applied to reduce the domain gap between real and virtual images, so that a deep learning network trained on the virtual dataset can also recognize the target objects in real scenes.
This study also proposes an automated grasping pipeline for randomly stacked objects. The pipeline first acquires color and depth images of the stacked objects with an RGB-D camera, then uses an instance segmentation model (Mask R-CNN) to separate the target objects from the background and classify them. Each segmented object is passed through an augmented autoencoder, whose grasp rectangles are predicted by a single-object grasp-point generation convolutional neural network (GG-CNN), which yields the object's pose and the corresponding grasp information. Finally, the depth image is used for grasp-point interference checking to complete the grasping pipeline.
Grasping experiments were then conducted in real stacked scenes. Models for grasping metal round tubes, T-shaped plastic pipes, L-shaped door handles, and mixed objects were each trained in a fully automated manner, achieving grasp success rates of 90.9%, 92.0%, 70.8%, and 87.0%, respectively.
[zh_TW]
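As a rough illustration of the synthetic-data step described in the abstract above, the sketch below shows how labeled training images might be rendered from a CAD model with randomized poses and lighting through Blender's Python API (bpy). The mesh file name, output paths, randomization ranges, and the assumption that the scene already contains a camera and ground plane are illustrative only; they are not taken from the thesis.

```python
# Minimal sketch, assuming Blender 3.x with the default STL importer enabled and a
# scene that already contains a camera; paths and ranges below are hypothetical.
import json
import math
import random

import bpy

NUM_IMAGES = 200
OUT_DIR = "/tmp/synthetic"  # assumed output folder

# Import the target CAD model once.
bpy.ops.import_mesh.stl(filepath="target.stl")
obj = bpy.context.selected_objects[-1]

# Add one point light whose energy will be randomized (simple domain randomization).
light_data = bpy.data.lights.new(name="rand_light", type='POINT')
light_obj = bpy.data.objects.new(name="rand_light", object_data=light_data)
bpy.context.collection.objects.link(light_obj)
light_obj.location = (0.0, 0.0, 1.0)

labels = []
for i in range(NUM_IMAGES):
    # Randomize the object pose; the ground-truth annotation comes for free.
    obj.location = (random.uniform(-0.1, 0.1), random.uniform(-0.1, 0.1), 0.02)
    obj.rotation_euler = tuple(random.uniform(0.0, 2.0 * math.pi) for _ in range(3))

    # Randomize lighting so the downstream model does not overfit to one rendering style.
    light_data.energy = random.uniform(100.0, 1000.0)

    # Render the RGB image.
    bpy.context.scene.render.filepath = f"{OUT_DIR}/img_{i:04d}.png"
    bpy.ops.render.render(write_still=True)

    # Record the pose as an automatically generated label.
    labels.append({"image": f"img_{i:04d}.png",
                   "location": list(obj.location),
                   "rotation_euler": list(obj.rotation_euler)})

with open(f"{OUT_DIR}/labels.json", "w") as f:
    json.dump(labels, f, indent=2)
```

In the thesis workflow, per-object instance masks for Mask R-CNN training would also be exported; Blender's object-index render pass is one common way to obtain such masks, but that step is omitted here to keep the sketch short.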
dc.description.abstract:
In robotic grasping tasks, deep learning networks often require real-world scene datasets annotated by humans, which leads to high upfront costs and little flexibility when production lines change. Therefore, this study utilizes simulation software to generate a training dataset and label it automatically. Because the model is trained on the synthetic dataset but deployed in the real world, domain randomization and domain adaptation techniques (such as style transfer) are employed to reduce the domain gap between real and synthetic data. Additionally, this study proposes an automated grasping system designed specifically for randomly stacked objects.
The system first uses an instance segmentation model to segment each object in the clutter, then applies an augmented autoencoder to estimate the object pose and obtain several grasp candidates via a generative grasping convolutional neural network (GG-CNN). Finally, to prevent gripper collisions, depth information is used to choose the optimal grasp point for the robot (a schematic pipeline sketch follows this abstract).
To validate the system, an experimental setup was established in the real world. The grasp success rates for tubes, T-shaped pipes, L-shaped handles, and mixed objects were 90.9%, 92.0%, 70.8%, and 87.0%, respectively.
[en]
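To make the pipeline in the English abstract concrete, here is a schematic, hedged sketch of how its stages could be chained. Every class and method name below (the segmenter, pose estimator, grasp predictor, and collision_free) is a hypothetical placeholder rather than the thesis's actual API; only the data flow follows the abstract.

```python
# Schematic sketch of the segment -> pose -> grasp -> collision-check flow described
# in the abstract; all component interfaces here are assumed placeholders.
from dataclasses import dataclass
from typing import List

import numpy as np


@dataclass
class GraspCandidate:
    center: tuple   # (u, v) pixel coordinates of the grasp rectangle center
    angle: float    # gripper rotation about the camera axis, in radians
    width: float    # gripper opening
    quality: float  # predicted grasp quality score


def pick_one_object(rgb: np.ndarray, depth: np.ndarray,
                    segmenter, pose_estimator, grasp_predictor) -> GraspCandidate:
    """One pass of the pipeline: segment the clutter, estimate poses, predict grasps,
    then reject grasps that would interfere according to the depth image."""
    # 1. Instance segmentation: masks, boxes, and class labels for each object.
    instances = segmenter.predict(rgb)  # hypothetical interface

    # 2. Per instance: augmented-autoencoder pose lookup plus GG-CNN grasp rectangles.
    candidates: List[GraspCandidate] = []
    for inst in instances:
        crop = rgb[inst.bbox.y0:inst.bbox.y1, inst.bbox.x0:inst.bbox.x1]
        pose = pose_estimator.estimate(crop)          # e.g. nearest codebook pose
        candidates.extend(grasp_predictor.predict(crop, pose))

    # 3. Interference check against the depth image, then keep the best feasible grasp.
    feasible = [g for g in candidates if collision_free(g, depth)]
    return max(feasible, key=lambda g: g.quality)


def collision_free(grasp: GraspCandidate, depth: np.ndarray) -> bool:
    # Placeholder: a real check would sample the depth under both finger footprints
    # and require enough clearance relative to the grasped surface.
    return True
```

A higher-level loop would call a function like pick_one_object repeatedly, re-capturing RGB-D images after every grasp, until the bin is empty.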
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-08-15T16:44:14Z
No. of bitstreams: 0
[en]
dc.description.provenance: Made available in DSpace on 2023-08-15T16:44:14Z (GMT). No. of bitstreams: 0 [en]
dc.description.tableofcontents:
Oral Defense Committee Approval i
Acknowledgements ii
Chinese Abstract iii
English Abstract iv
Chapter 1 Introduction 1
1.1 Preface 1
1.2 Literature Review 1
1.2.1 Single-Model Systems 2
1.2.2 Multi-Model Systems 6
1.3 Research Motivation and Objectives 9
1.4 Thesis Organization 10
Chapter 2 Training Dataset Generation 11
2.1 Virtual Dataset Collection 11
2.1.1 Rendering and Simulation Software Blender 11
2.1.2 Models of the Grasped Objects 12
2.1.3 Data Collection and Annotation 13
2.2 Virtual-to-Real Dataset Transfer 13
2.2.1 Domain Adaptation 14
2.2.2 Style Transfer Model CUT 15
2.2.3 Domain Randomization 17
2.2.4 Multi-Stage Learning Model Randomization 18
2.3 Effectiveness of the Virtual Dataset 20
2.3.1 t-SNE 22
2.3.2 FID 23
Chapter 3 Object Segmentation 25
3.1 Computer Vision Tasks 25
3.1.1 Image Recognition 26
3.1.2 Semantic Segmentation 26
3.1.3 Object Detection 27
3.1.4 Instance Segmentation 29
3.2 Instance Segmentation Model Mask R-CNN 30
3.2.1 Feature Extraction Network 30
3.2.2 Region Proposal Network 31
3.2.3 Region of Interest Alignment 32
3.2.4 Prediction Branches 33
3.3 Model Training 33
3.3.1 Model Parameters and Loss Functions 34
3.3.2 Training Dataset 35
3.3.3 Curriculum Learning 36
3.4 Model Prediction Results 38
3.4.1 Introduction to mAP and mAR 38
3.4.2 Comparison of Style Transfer Model Effectiveness 40
3.4.3 Comparison of Curriculum Learning Effectiveness 41
3.4.4 Model Prediction Examples 43
Chapter 4 Pose Estimation and Grasp Prediction 45
4.1 Pose Ambiguity 45
4.2 Augmented Autoencoder 46
4.2.1 Autoencoder 46
4.2.2 Virtual Training Dataset 47
4.2.3 Model Training 48
4.3 Pose Estimation 49
4.3.1 Pose Dataset 49
4.3.2 Pose Codebook 50
4.3.3 Model Prediction Results 50
4.4 Grasp Prediction 51
4.4.1 Automated Base Grasp Rectangle Annotation 52
4.4.2 Grasp Rectangle Augmentation 55
4.4.3 Collision Detection 56
Chapter 5 System and Validation 59
5.1 System Overview 59
5.1.1 System Architecture 59
5.1.2 Experimental Environment 59
5.1.3 Grasping Procedure 60
5.2 Grasping Validation 61
5.2.1 Experimental Procedure 61
5.2.2 Grasping Results 63
5.2.3 Discussion of Results 64
Chapter 6 Conclusions and Future Work 65
6.1 Conclusions 65
6.2 Future Work 66
References 67
Appendix A Dimension Drawing of the Metal Round Tube 71
Appendix B Dimension Drawing of the T-Shaped Pipe 72
Appendix C Dimension Drawing of the L-Shaped Door Handle 73
[-]
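Section 2.3 of the table of contents evaluates the synthetic dataset with t-SNE and FID. As a rough illustration only (the feature extractor, folder layout, and preprocessing below are assumptions, not the thesis's setup), one common way to visualize the real-versus-synthetic domain gap is to embed CNN features of both image sets with scikit-learn's t-SNE:

```python
# Hedged t-SNE domain-gap visualization sketch; ResNet-18 features and folder names
# are assumed for illustration.
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
import torch
import torchvision
from PIL import Image
from sklearn.manifold import TSNE

preprocess = torchvision.transforms.Compose([
    torchvision.transforms.Resize((224, 224)),
    torchvision.transforms.ToTensor(),
])

# ImageNet-pretrained backbone with the classifier head removed -> 512-d features.
backbone = torchvision.models.resnet18(weights="DEFAULT")
backbone.fc = torch.nn.Identity()
backbone.eval()


@torch.no_grad()
def extract_features(folder: str) -> np.ndarray:
    feats = []
    for path in sorted(Path(folder).glob("*.png")):
        img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
        feats.append(backbone(img).squeeze(0).numpy())
    return np.stack(feats)


real = extract_features("data/real")        # assumed folder of real RGB images
synth = extract_features("data/synthetic")  # assumed folder of rendered images

# Project both feature sets to 2D; heavily separated clusters suggest a large domain gap.
emb = TSNE(n_components=2, perplexity=30, init="pca").fit_transform(np.vstack([real, synth]))
plt.scatter(emb[:len(real), 0], emb[:len(real), 1], label="real", alpha=0.6)
plt.scatter(emb[len(real):, 0], emb[len(real):, 1], label="synthetic", alpha=0.6)
plt.legend()
plt.savefig("tsne_domain_gap.png", dpi=150)
```

FID (section 2.3.2 in the contents) would instead compare Inception feature statistics of the two image sets; libraries such as torchmetrics provide an implementation.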
dc.language.iso: zh_TW [-]
dc.subject: 自動編碼網路 (autoencoder) [zh_TW]
dc.subject: 堆疊夾取 (grasping in clutter) [zh_TW]
dc.subject: 合成資料集 (synthetic dataset) [zh_TW]
dc.subject: 風格轉換 (style transfer) [zh_TW]
dc.subject: 實例分割 (instance segmentation) [zh_TW]
dc.subject: 姿態估測 (pose estimation) [zh_TW]
dc.subject: Synthetic Data [en]
dc.subject: Grasping in clutter [en]
dc.subject: Augmented Autoencoder [en]
dc.subject: Pose Estimation [en]
dc.subject: Instance Segmentation [en]
dc.subject: Style Transfer [en]
dc.title: 全自動化深度學習隨機堆疊物件夾取流程之建構 [zh_TW]
dc.title: Construction of an Automated Deep Learning Process for Random Bin Picking [en]
dc.type: Thesis [-]
dc.date.schoolyear: 111-2 [-]
dc.description.degree: 碩士 (Master's) [-]
dc.contributor.oralexamcommittee: 黃漢邦;林峻永;施博仁 [zh_TW]
dc.contributor.oralexamcommittee: Han-Pang Huang;Chun-Yeon Lin;Po-Jen Shih [en]
dc.subject.keyword: 堆疊夾取,合成資料集,風格轉換,實例分割,姿態估測,自動編碼網路 [zh_TW]
dc.subject.keyword: Grasping in clutter,Synthetic Data,Style Transfer,Instance Segmentation,Pose Estimation,Augmented Autoencoder [en]
dc.relation.page: 73 [-]
dc.identifier.doi: 10.6342/NTU202301733 [-]
dc.rights.note: 同意授權(限校園內公開) (authorization granted; on-campus access only) [-]
dc.date.accepted: 2023-07-27 [-]
dc.contributor.author-college: 工學院 (College of Engineering) [-]
dc.contributor.author-dept: 機械工程學系 (Department of Mechanical Engineering) [-]
dc.date.embargo-lift: 2025-07-20 [-]
Appears in collections: 機械工程學系 (Department of Mechanical Engineering)

Files in this item:
File | Size | Format
ntu-111-2.pdf (access restricted to NTU campus IP addresses; off-campus users should use the VPN service) | 6.24 MB | Adobe PDF


Items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
