Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/74958
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 施吉昇(Chi-Sheng Shih) | |
dc.contributor.author | Cheng-Shao Chiang | en |
dc.contributor.author | 蔣承劭 | zh_TW |
dc.date.accessioned | 2021-06-17T09:11:16Z | - |
dc.date.available | 2024-09-03 | |
dc.date.copyright | 2019-09-03 | |
dc.date.issued | 2019 | |
dc.date.submitted | 2019-08-29 | |
dc.identifier.citation | [1] H.-A. Chiang, C.-S. Chiang, and C.-S. Shih, “Using collaged data augmentation to train deep neural net with few data,” in International Conference on Medical Imaging with Deep Learning – Extended Abstract Track, London, United Kingdom, 08–10 Jul 2019. [Online]. Available: https://openreview.net/forum?id=Hkg8UwMRKE
[2] L.-C. Chen, Y. Zhu, G. Papandreou, F. Schroff, and H. Adam, “Encoder-decoder with atrous separable convolution for semantic image segmentation,” in The European Conference on Computer Vision (ECCV), September 2018.
[3] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” CoRR, vol. abs/1512.03385, 2015. [Online]. Available: http://arxiv.org/abs/1512.03385
[4] M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, G. Klambauer, and S. Hochreiter, “GANs trained by a two time-scale update rule converge to a Nash equilibrium,” CoRR, vol. abs/1706.08500, 2017. [Online]. Available: http://arxiv.org/abs/1706.08500
[5] J. Song, J. Wang, L. Zhao, S. Huang, and G. Dissanayake, “MIS-SLAM: Real-time large scale dense deformable SLAM system in minimal invasive surgery based on heterogeneous computing,” CoRR, vol. abs/1803.02009, 2018. [Online]. Available: http://arxiv.org/abs/1803.02009
[6] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: Unified, real-time object detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 779–788.
[7] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2014.
[8] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015.
[9] H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia, “Pyramid scene parsing network,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.
[10] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional networks for biomedical image segmentation,” in Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, N. Navab, J. Hornegger, W. M. Wells, and A. F. Frangi, Eds. Cham: Springer International Publishing, 2015, pp. 234–241.
[11] M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth, and B. Schiele, “The Cityscapes dataset for semantic urban scene understanding,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016.
[12] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, “The PASCAL Visual Object Classes Challenge 2012 (VOC2012) Results,” http://www.pascal-network.org/challenges/VOC/voc2012/workshop/index.html.
[13] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” in Advances in Neural Information Processing Systems 27, Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, Eds. Curran Associates, Inc., 2014, pp. 2672–2680. [Online]. Available: http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf
[14] B. Münzer, K. Schoeffmann, and L. Böszörmenyi, “Content-based processing and analysis of endoscopic images and videos: A survey,” Multimedia Tools and Applications, vol. 77, no. 1, pp. 1323–1362, Jan 2018. [Online]. Available: https://doi.org/10.1007/s11042-016-4219-z
[15] V. Badrinarayanan, A. Kendall, and R. Cipolla, “SegNet: A deep convolutional encoder-decoder architecture for image segmentation,” CoRR, vol. abs/1511.00561, 2015. [Online]. Available: http://arxiv.org/abs/1511.00561
[16] T. Karras, T. Aila, S. Laine, and J. Lehtinen, “Progressive growing of GANs for improved quality, stability, and variation,” CoRR, vol. abs/1710.10196, 2017. [Online]. Available: http://arxiv.org/abs/1710.10196
[17] T.-C. Wang, M.-Y. Liu, J.-Y. Zhu, A. Tao, J. Kautz, and B. Catanzaro, “High-resolution image synthesis and semantic manipulation with conditional GANs,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.
[18] D. Dwibedi, I. Misra, and M. Hebert, “Cut, paste and learn: Surprisingly easy synthesis for instance detection,” CoRR, vol. abs/1708.01642, 2017. [Online]. Available: http://arxiv.org/abs/1708.01642
[19] J. Lemley, S. Bazrafkan, and P. Corcoran, “Smart augmentation – learning an optimal data augmentation strategy,” CoRR, vol. abs/1703.08383, 2017. [Online]. Available: http://arxiv.org/abs/1703.08383
[20] K. He, X. Zhang, S. Ren, and J. Sun, “Spatial pyramid pooling in deep convolutional networks for visual recognition,” CoRR, vol. abs/1406.4729, 2014. [Online]. Available: http://arxiv.org/abs/1406.4729
[21] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” CoRR, vol. abs/1502.03167, 2015. [Online]. Available: http://arxiv.org/abs/1502.03167 | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/74958 | - |
dc.description.abstract | 隨著內視鏡手術的普及,愈來愈多研究著重在透過影像來輔助醫師進行手術。許多研究都是建立在常見的電腦視覺問題,例如物件辨識、同時定位及建立地圖等。本篇論文提出了一個資料增強的方法,來解決只擁有少量資料,如何訓練神經網路的問題。本篇論文中的應用場景為內視鏡手術情境下的語意分割,語意分割可以讓我們知道影像中出現了哪些器官等等的資訊,以利後續的其他應用。雖然語意分割已經有了大量的研究,但是這些現有的方法都需要大量的訓練資料,但是關於內視鏡手術的資料十分稀少以至於現有的演算法被侷限。實驗結果證明提出的資料增強方法可以有效的增加器官的辨識率。 | zh_TW |
dc.description.abstract | As computer-aided surgery becomes popular, more and more research has been conducted to help surgeons operate. Most of this research focuses on common computer-vision tasks and tries to provide surgeons with more information by analyzing the captured images. In this thesis, we aim at semantic segmentation in the endoscopic surgery scenario, because semantic segmentation is the first step for a computer to grasp what appears in the view of an endoscope. Although semantic segmentation is a popular research topic, most current algorithms focus on road scenes and require myriads of training data. Since data for endoscopic surgery scenes is relatively scarce, the performance of existing algorithms is rather limited. We therefore tackle the problem of training a semantic segmentation network with few data, and propose a data augmentation method that synthesizes new training data. The experimental results show that our method effectively improves the performance in recognizing anatomical objects. | en |
dc.description.provenance | Made available in DSpace on 2021-06-17T09:11:16Z (GMT). No. of bitstreams: 1 ntu-108-R06922131-1.pdf: 9369315 bytes, checksum: 093c08e1443f22f3b70571450a42b1d8 (MD5) Previous issue date: 2019 | en |
dc.description.tableofcontents | Acknowledgments ii
摘要 (Abstract in Chinese) iii
Abstract iv
1 Introduction 1
1.1 Motivation 1
1.2 Contribution 3
1.3 Thesis Organization 4
2 Background and Related Work 5
2.1 Background 5
2.1.1 Computer-Assisted Endoscopy Surgery 5
2.1.2 Semantic Segmentation 6
2.1.3 MICCAI 2018 Robotic Scene Segmentation Sub-Challenge 7
2.1.4 Generative Adversarial Network 7
2.1.5 Data Augmentation 8
3 System Architecture and Problem Definition 11
3.1 System Architecture 11
3.1.1 Network Architecture 11
3.1.2 Atrous Spatial Pyramid Module 13
3.1.3 ResNet 14
3.2 Problem Definition 16
4 Design and Implementation 17
4.1 Proposed Method 17
4.2 Design Details and Algorithms 18
5 Performance Evaluation 24
5.1 Performance Evaluation 24
5.1.1 Evaluation Metrics 24
5.1.2 Quantitative and Qualitative Results of Synthesized Images 26
5.1.3 Quantitative Result of Our Data Augmentation Method 30
6 Conclusion 34
Bibliography 35 | |
dc.language.iso | en | |
dc.title | 內視鏡手術情境下的語意分割-以資料增強達到利用少量資料訓練深層神經網路 | zh_TW |
dc.title | Semantic Segmentation in Endoscopy Surgery: Using Data Augmentation to Train Deep Neural Net with Few Data | en |
dc.type | Thesis | |
dc.date.schoolyear | 107-2 | |
dc.description.degree | 碩士 | |
dc.contributor.oralexamcommittee | 辛賢楷,林忠緯 | |
dc.subject.keyword | 資料增強, 生成對抗網路, 電腦輔助內視鏡手術, 類神經網路, 語意分割, 深度學習 | zh_TW |
dc.subject.keyword | Data Augmentation, Semantic Segmentation, Generative Adversarial Network, Computer-Assisted Endoscopy Surgery, Neural Network, Deep Learning | en |
dc.relation.page | 37 | |
dc.identifier.doi | 10.6342/NTU201903885 | |
dc.rights.note | 有償授權 | |
dc.date.accepted | 2019-08-30 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 資訊工程學研究所 | zh_TW |
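The augmentation described in the abstract (and in reference [1], “Using collaged data augmentation to train deep neural net with few data”) synthesizes new training pairs by compositing annotated anatomy into other frames. As a rough illustration only — the function name, the `offset` parameter, and the single-channel toy images below are assumptions for the sketch, not the thesis's actual implementation — a mask-guided cut-and-paste step might look like:

```python
import numpy as np

def collage_augment(src_img, src_mask, dst_img, dst_mask, class_id, offset=(0, 0)):
    """Cut the pixels labeled `class_id` out of (src_img, src_mask) and paste
    them into a copy of (dst_img, dst_mask) at the given (row, col) offset,
    yielding a new synthetic image/label training pair."""
    out_img = dst_img.copy()
    out_mask = dst_mask.copy()
    ys, xs = np.nonzero(src_mask == class_id)   # source pixels of the object
    if ys.size == 0:
        return out_img, out_mask                # nothing to paste
    dy, dx = offset
    ty, tx = ys + dy, xs + dx                   # target coordinates
    # keep only pixels that land inside the destination image
    valid = (ty >= 0) & (ty < out_img.shape[0]) & (tx >= 0) & (tx < out_img.shape[1])
    ys, xs, ty, tx = ys[valid], xs[valid], ty[valid], tx[valid]
    out_img[ty, tx] = src_img[ys, xs]           # paste object pixels
    out_mask[ty, tx] = class_id                 # keep the label map consistent
    return out_img, out_mask
```

Calling this repeatedly with random offsets (and, in practice, random scaling/rotation and blending at the paste boundary) multiplies a small annotated set into many distinct training pairs while keeping pixel-level labels exact.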
Appears in Collections: | Department of Computer Science and Information Engineering (資訊工程學系)
Files in this item:
File | Size | Format | |
---|---|---|---|
ntu-108-1.pdf (currently not authorized for public access) | 9.15 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated in their license terms.