Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/1343

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 施吉昇(Chi-Sheng Shih) | |
| dc.contributor.author | Hsun-An Chiang | en |
| dc.contributor.author | 江洵安 | zh_TW |
| dc.date.accessioned | 2021-05-12T09:36:47Z | - |
| dc.date.available | 2019-08-23 | |
| dc.date.available | 2021-05-12T09:36:47Z | - |
| dc.date.copyright | 2018-08-23 | |
| dc.date.issued | 2018 | |
| dc.date.submitted | 2018-08-17 | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/handle/123456789/1343 | - |
| dc.description.abstract | 隨著以內視鏡微創手術的興起,許多研究試著透過分析處理內視鏡影像來提供手術人員即時的協助。其中,我們試著處理內視鏡影像的語意分割問題,因為語意分割可以為許多應用提供重要的資訊,像是虛擬實境(VR)、擴增實境(AR)或是內視鏡的同時定位與建地圖(SLAM)。語意分割在都市場景或自然場景上已經累積了大量的研究,但是因為缺乏充足的學習資料,鮮少有研究觸及內視鏡手術。在這篇研究中,我們提出了一個資料增強的方法,和一種作用在網路中間層的監督模式,用來解決使用少量資料訓練深度網路的問題。實驗結果證明提出的方法可以比常用的資料增強方法更有效的提升網路的準確度。 | zh_TW |
| dc.description.abstract | With the increasing popularity of endoscope-based minimally invasive surgery, many efforts have been made to provide surgeons with real-time assistance by processing video frames from the endoscope. We aim at a particular problem, endoscopic semantic segmentation, which can provide important information for other applications such as VR, AR, or endoscopic SLAM. While semantic segmentation in other scenarios, e.g. urban or natural scenes, has been intensively studied, it has seldom reached the area of endoscopic surgery due to the lack of large-scale, finely annotated datasets. In this work, we address the problem of training a deep neural network with little training data in the case of endoscopic surgery by introducing an aggressive data augmentation technique and an additional loss term applied to intermediate layers of the network. Experimental results show that the proposed methods improve network performance more effectively than commonly used data augmentation on an endoscopic surgery dataset, and improve the performance of a state-of-the-art network by 4.67 in terms of mIoU (%). | en |
| dc.description.provenance | Made available in DSpace on 2021-05-12T09:36:47Z (GMT). No. of bitstreams: 1 ntu-107-R04944017-1.pdf: 9991868 bytes, checksum: cf3780856f1a16610362213a5eb63c4b (MD5) Previous issue date: 2018 | en |
| dc.description.tableofcontents | Contents
口試委員會審定書 i
Acknowledgments ii
摘要 iii
Abstract iv
1 Introduction 1
  1.1 Motivation 1
  1.2 Contribution 3
  1.3 Thesis Organization 4
2 Background and Related Work 5
  2.1 Semantic Segmentation 5
  2.2 Data Augmentation 6
  2.3 Computer-assisted Endoscopy Surgery 7
3 Problem Definition and System Architecture 9
  3.1 Problem Definition 9
    3.1.1 Semantic Segmentation in Endoscopy Environment 9
    3.1.2 Issues in Endoscopy Environment 10
  3.2 System Architecture 11
    3.2.1 Network Architecture 12
    3.2.2 ResNet 12
    3.2.3 Pyramid Pooling Module 14
4 Design and Implementation 15
  4.1 Collage 15
  4.2 Auxiliary Loss With Meta-Class 17
5 Performance Evaluation 20
  5.1 Experiment Environment 20
  5.2 Evaluation Results 21
    5.2.1 Ablation Study for Commonly Used Data Augmentation 21
    5.2.2 Ablation Study for Collage 21
    5.2.3 Ablation Study for Auxiliary Loss with Meta-Class 23
    5.2.4 Combined Study 24
6 Conclusion 26
Bibliography 27 | |
| dc.language.iso | en | |
| dc.subject | 語意分割 | zh_TW |
| dc.subject | 中間層監督 | zh_TW |
| dc.subject | 深度學習 | zh_TW |
| dc.subject | 資料增強 | zh_TW |
| dc.subject | 類神經網路 | zh_TW |
| dc.subject | 微創手術 | zh_TW |
| dc.subject | 內視鏡手術 | zh_TW |
| dc.subject | Intermediate Layer Supervision | en |
| dc.subject | Endoscopy Surgery | en |
| dc.subject | Robot-assisted Surgery | en |
| dc.subject | Semantic Segmentation | en |
| dc.subject | Deep Learning | en |
| dc.subject | Neural Network | en |
| dc.subject | Data Augmentation | en |
| dc.subject | Auxiliary Loss | en |
| dc.title | 內視鏡手術的語意分割:利用資料增強與中間層監督達到以少量資料訓練深層網路 | zh_TW |
| dc.title | Semantic Segmentation in Endoscopy Surgery: Using Data Augmentation and Intermediate Layer Supervision to Train Deep Neural Net with Few Data | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 106-2 | |
| dc.description.degree | 碩士 (Master) | |
| dc.contributor.oralexamcommittee | 楊佳玲(Chia-Lin Yang),徐慰中(Wei-Chung Hsu) | |
| dc.subject.keyword | 內視鏡手術,微創手術,語意分割,資料增強,深度學習,類神經網路,中間層監督, | zh_TW |
| dc.subject.keyword | Endoscopy Surgery,Robot-assisted Surgery,Semantic Segmentation,Deep Learning,Neural Network,Data Augmentation,Auxiliary Loss,Intermediate Layer Supervision, | en |
| dc.relation.page | 30 | |
| dc.identifier.doi | 10.6342/NTU201803348 | |
| dc.rights.note | 同意授權(全球公開) (authorized, worldwide open access) | |
| dc.date.accepted | 2018-08-17 | |
| dc.contributor.author-college | 電機資訊學院 | zh_TW |
| dc.contributor.author-dept | 資訊網路與多媒體研究所 | zh_TW |
| Appears in Collections: | 資訊網路與多媒體研究所 (Graduate Institute of Networking and Multimedia) | |
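The abstract above names two techniques, a collage-style data augmentation and an auxiliary loss with meta-classes on intermediate layers, but the record itself contains no implementation details. The following Python sketch is a rough illustration only, not the thesis code: the grid size, the META_OF_CLASS mapping, the 0.4 auxiliary weight, and all tensor shapes are assumptions made for demonstration.

```python
# Illustrative sketch only -- NOT the thesis implementation. Shapes, class ids,
# and the meta-class mapping below are assumptions for demonstration.
import numpy as np
import torch
import torch.nn.functional as F


def collage_augment(images, masks, grid=(2, 2), rng=None):
    """Build one new (image, mask) training pair by tiling random crops taken
    from several annotated frames (a collage-style augmentation)."""
    rng = rng if rng is not None else np.random.default_rng()
    H, W = images[0].shape[:2]
    gh, gw = grid
    th, tw = H // gh, W // gw
    out_img = np.zeros_like(images[0])
    out_msk = np.zeros_like(masks[0])
    for r in range(gh):
        for c in range(gw):
            k = int(rng.integers(len(images)))      # pick a source frame
            y = int(rng.integers(0, H - th + 1))    # random crop position
            x = int(rng.integers(0, W - tw + 1))
            out_img[r * th:(r + 1) * th, c * tw:(c + 1) * tw] = images[k][y:y + th, x:x + tw]
            out_msk[r * th:(r + 1) * th, c * tw:(c + 1) * tw] = masks[k][y:y + th, x:x + tw]
    return out_img, out_msk


# Hypothetical mapping from fine class ids to coarser meta-classes,
# e.g. background -> 0, instrument parts -> 1, tissue types -> 2.
META_OF_CLASS = torch.tensor([0, 1, 1, 1, 2, 2, 2])


def auxiliary_meta_class_loss(aux_logits, fine_labels):
    """Cross-entropy supervision applied to an intermediate feature map,
    computed against meta-class labels derived from the fine annotation."""
    meta_labels = META_OF_CLASS[fine_labels]                      # (B, H, W) long
    logits = F.interpolate(aux_logits, size=meta_labels.shape[-2:],
                           mode="bilinear", align_corners=False)  # (B, num_meta, H, W)
    return F.cross_entropy(logits, meta_labels)


if __name__ == "__main__":
    # Toy shapes: 2 frames of 64x64 RGB with 7 fine classes and 3 meta-classes.
    imgs = [np.random.rand(64, 64, 3).astype(np.float32) for _ in range(2)]
    msks = [np.random.randint(0, 7, (64, 64)) for _ in range(2)]
    img, msk = collage_augment(imgs, msks)

    aux_logits = torch.randn(1, 3, 16, 16)            # intermediate-layer prediction
    fine_labels = torch.from_numpy(msk[None]).long()  # (1, 64, 64)
    main_logits = torch.randn(1, 7, 64, 64)
    loss = F.cross_entropy(main_logits, fine_labels) \
           + 0.4 * auxiliary_meta_class_loss(aux_logits, fine_labels)
    print(float(loss))
```

In deep-supervision setups of this kind, the auxiliary term is usually added to the main segmentation loss with a small weight and discarded at inference time; whether the thesis follows exactly this recipe is not stated in the record.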
Files in this item:
| File | Size | Format |
|---|---|---|
| ntu-107-1.pdf | 9.76 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
