NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/58961
Full metadata record
dc.contributor.advisor: 丁建均 (Jiang-Jiun Ding)
dc.contributor.author: Jin-Yu Huang [en]
dc.contributor.author: 黃晉禹 [zh_TW]
dc.date.accessioned: 2021-06-16T08:41:17Z
dc.date.available: 2020-07-27
dc.date.copyright: 2020-07-27
dc.date.issued: 2020
dc.date.submitted: 2020-07-16
dc.identifier.citation:
[1] F. Y. Shih and S. Cheng, “Automatic seeded region growing for color image segmentation,” Image and Vision Computing, vol. 23, no. 10, pp. 877–886, 2005.
[2] D. Comaniciu and P. Meer, “Mean shift: A robust approach toward feature space analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 5, pp. 603–619, 2002.
[3] P. Arbeláez, M. Maire, C. Fowlkes, and J. Malik, “From contours to regions: An empirical evaluation,” in 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 2294–2301.
[4] P. Arbeláez, M. Maire, C. Fowlkes, and J. Malik, “Contour detection and hierarchical image segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 5, pp. 898–916, 2010.
[5] T. Cour, F. Benezit, and J. Shi, “Spectral segmentation with multiscale graph decomposition,” in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), vol. 2, 2005, pp. 1124–1131.
[6] J. Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 888–905, 2000.
[7] P. F. Felzenszwalb and D. P. Huttenlocher, “Efficient graph-based image segmentation,” International Journal of Computer Vision, vol. 59, no. 2, pp. 167–181, 2004.
[8] Z. Li, X.-M. Wu, and S.-F. Chang, “Segmentation using superpixels: A bipartite graph partitioning approach,” in 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012, pp. 789–796.
[9] T. H. Kim, K. M. Lee, and S. U. Lee, “Learning full pairwise affinities for spectral segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 7, pp. 1690–1703, 2012.
[10] Y. Yang, Y. Wang, and X. Xue, “A novel spectral clustering method with superpixels for image segmentation,” Optik, vol. 127, no. 1, pp. 161–167, 2016.
[11] X. Xia and B. Kulis, “W-Net: A deep model for fully unsupervised image segmentation,” arXiv preprint arXiv:1711.08506, 2017.
[12] H. Noh, S. Hong, and B. Han, “Learning deconvolution network for semantic segmentation,” in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 1520–1528.
[13] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 3431–3440.
[14] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, “DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 4, pp. 834–848, 2017.
[15] M.-Y. Liu, O. Tuzel, S. Ramalingam, and R. Chellappa, “Entropy rate superpixel segmentation,” in CVPR 2011, 2011, pp. 2097–2104.
[16] R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. Süsstrunk, “SLIC superpixels compared to state-of-the-art superpixel methods,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 11, pp. 2274–2282, 2012.
[17] V. Jampani, D. Sun, M.-Y. Liu, M.-H. Yang, and J. Kautz, “Superpixel sampling networks,” in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 352–368.
[18] W.-C. Tu, M.-Y. Liu, V. Jampani, D. Sun, S.-Y. Chien, M.-H. Yang, and J. Kautz, “Learning superpixels with segmentation-aware affinity loss,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 568–576.
[19] S. Xie and Z. Tu, “Holistically-nested edge detection,” in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 1395–1403.
[20] D. Haehn, V. Kaynig, J. Tompkin, J. W. Lichtman, and H. Pfister, “Guided proofreading of automatic segmentations for connectomics,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.
[21] N. Agrawal, P. Sinha, A. Kumar, and S. Bagai, “Fast dynamic image restoration using Laplace equation based image inpainting,” J. Undergraduate Res. Innovation, vol. 1, no. 2, pp. 115–123, 2015.
[22] A. P. Kelm, V. S. Rao, and U. Zölzer, “Object contour and edge detection with RefineContourNet,” in International Conference on Computer Analysis of Images and Patterns, Springer, 2019, pp. 246–258.
[23] D. J. Field, “Relations between the statistics of natural images and the response properties of cortical cells,” JOSA A, vol. 4, no. 12, pp. 2379–2394, 1987.
[24] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
[25] M. Donoser and D. Schmalstieg, “Discrete-continuous gradient orientation estimation for faster image segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 3158–3165.
[26] C. J. Taylor, “Towards fast and accurate segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2013, pp. 1916–1922.
[27] M. Everingham and J. Winn, “The PASCAL Visual Object Classes Challenge 2012 (VOC2012) development kit,” Pattern Analysis, Statistical Modelling and Computational Learning, Tech. Rep., 2011.
[28] P. Dollár and C. L. Zitnick, “Structured forests for fast edge detection,” in Proceedings of the IEEE International Conference on Computer Vision, 2013, pp. 1841–1848.
[29] L.-C. Chen, G. Papandreou, F. Schroff, and H. Adam, “Rethinking atrous convolution for semantic image segmentation,” arXiv preprint arXiv:1706.05587, 2017.
[30] C. Szegedy, S. Ioffe, V. Vanhoucke, and A. A. Alemi, “Inception-v4, Inception-ResNet and the impact of residual connections on learning,” in Thirty-First AAAI Conference on Artificial Intelligence, 2017.
[31] F. Chollet, “Xception: Deep learning with depthwise separable convolutions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 1251–1258.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/58961
dc.description.abstract: 近期,卷積神經網路(CNN)在圖像分割中已被廣泛採用。但是,現有的基於CNN的影像分割算法多是以單一像素為單位進行預測。由於超像素的不規則形狀和尺寸,很難將超像素直接應用於CNN架構。在本文中,我們提出了多種轉換的機制來使得CNN學習基於超像素的圖像分割。首先提出的算法採用包含兩個超像素的正方形影像作為CNN的輸入,然後CNN的輸出結果是兩個超像素是否應該合併。另外,即使只有很少的訓練圖像,我們的方法也可以從中獲得大量的訓練數據。在第一種算法的啟發下,我們進一步提出了第二種算法來從不同角度出發,該算法利用全卷積網絡(FCN)來解決影像切割的問題。提出的第二種算法將堆疊有彩色圖像的多通道圖像以及諸如超像素邊界圖和邊緣檢測結果的幾個特徵圖作為深度神經網絡的輸入,並輸出超像素邊界圖的預測,該預測圖提供了兩個相鄰超像素的邊界是否應該消失或保留,並進一步地讓我們去執行超像素合併算法。也就是說,通過一次解決第一個算法中的所有子問題,FCN以較大的幅度增進了整個分割過程的速度,同時獲得了較高的精度。總體而言,模擬結果顯示,兩種提出的算法都可以實現非常高精度的影像分割結果,並且在所有評估指標上均優於最新的圖像分割方法。 [zh_TW]
dc.description.abstract: Recently, the CNN has been widely adopted in image segmentation. However, existing CNN-based segmentation algorithms make predictions pixel by pixel. It is hard to apply superpixels to CNN architectures directly because of the irregular shapes and sizes of superpixels. In this paper, we propose several transformation techniques that let the CNN learn superpixel-based image segmentation. The first proposed algorithm takes a square patch containing two superpixels as the input of the CNN, and the output of the CNN is whether the two superpixels should be merged. In addition, a huge amount of training data can be obtained even if only a few training images are available. Inspired by the first algorithm, we further propose a second algorithm that utilizes a fully convolutional network (FCN) to solve the problem from a different perspective. The second proposed algorithm takes a multi-channel image, consisting of the stacked color image and several feature maps such as the superpixel boundary map and an edge detection result, as the input of a deep neural network. The network outputs a prediction of the superpixel boundary map that indicates whether the boundary between two adjacent superpixels should be kept or removed, which in turn drives the superpixel merging algorithm. That is, by solving all the subproblems of the first algorithm in a single forward pass, the FCN speeds up the whole segmentation process by a wide margin while attaining higher accuracy. Overall, simulations show that both proposed algorithms achieve highly accurate segmentation results and outperform state-of-the-art image segmentation methods in all evaluation metrics. [en]
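To make the first algorithm concrete, a minimal sketch of a two-superpixel merge classifier is given below. Everything in it is an assumption for illustration: the name MergeNet, the 5-channel input (RGB plus one binary mask per superpixel of the pair), the 64×64 patch size, and the layer widths are not taken from the thesis.

```python
# Hypothetical sketch (PyTorch) of a two-superpixel merge classifier.
# The 5-channel input (RGB + one mask per superpixel), the 64x64 patch
# size, and the layer widths are illustrative assumptions.
import torch
import torch.nn as nn

class MergeNet(nn.Module):
    """Binary decision: should the two superpixels in the patch be merged?"""
    def __init__(self, in_channels=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                    # 64 -> 32
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                    # 32 -> 16
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),            # global average pooling
        )
        self.classifier = nn.Linear(128, 2)     # merge vs. keep separate

    def forward(self, x):
        # x: (B, in_channels, 64, 64) patches, each covering one superpixel pair
        return self.classifier(self.features(x).flatten(1))

net = MergeNet()
patches = torch.randn(8, 5, 64, 64)  # a batch of two-superpixel patches
logits = net(patches)                # (8, 2) merge/keep logits
```

Because every pair of adjacent superpixels yields one such patch, even a handful of annotated images produces a large training set, which matches the data-abundance point made in the abstract.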
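For the second algorithm, the sketch below shows one plausible way to assemble the multi-channel input. SLIC and the Canny detector are stand-ins for the superpixel and edge-detection features (the abstract does not name the exact methods), and n_segments, sigma, and build_fcn_input are invented for illustration.

```python
# Hypothetical assembly of the 5-channel FCN input: color image stacked
# with a superpixel boundary map and an edge detection result. SLIC and
# Canny are stand-in choices; parameters are illustrative assumptions.
import numpy as np
from skimage.segmentation import slic, find_boundaries
from skimage.feature import canny
from skimage.color import rgb2gray

def build_fcn_input(rgb):
    """Stack color + superpixel boundary map + edge map into (H, W, 5)."""
    labels = slic(rgb, n_segments=300)                # initial superpixels
    boundary = find_boundaries(labels, mode='thick')  # 1 on superpixel borders
    edges = canny(rgb2gray(rgb), sigma=2.0)           # generic edge features
    return np.dstack([
        rgb.astype(np.float32) / 255.0,               # 3 color channels (uint8 in)
        boundary.astype(np.float32)[..., None],       # channel 4: boundary map
        edges.astype(np.float32)[..., None],          # channel 5: edge map
    ])
```

The FCN then maps this stack to a predicted superpixel boundary map, answering every keep/remove question in one forward pass instead of one CNN evaluation per superpixel pair.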
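Finally, the merging step driven by the predicted boundary map could proceed roughly as in the union-find sketch below; the 0.5 threshold, the per-border averaging rule, and the name merge_superpixels are assumptions, not the thesis's exact procedure.

```python
# Hypothetical merging step: dissolve superpixel borders that the FCN
# predicts should disappear. Threshold and averaging rule are assumptions.
import numpy as np

def merge_superpixels(labels, boundary_prob, threshold=0.5):
    """labels: (H, W) int superpixel ids; boundary_prob: (H, W) in [0, 1]."""
    parent = np.arange(int(labels.max()) + 1)

    def find(a):                                  # union-find root lookup
        while parent[a] != a:
            parent[a] = parent[parent[a]]         # path compression
            a = parent[a]
        return a

    # Mean predicted boundary strength between each pair of adjacent ids,
    # gathered from horizontally and vertically neighboring pixels.
    strength = {}
    for s1, s2 in (((slice(None), slice(0, -1)), (slice(None), slice(1, None))),
                   ((slice(0, -1), slice(None)), (slice(1, None), slice(None)))):
        a, b = labels[s1].ravel(), labels[s2].ravel()
        p = 0.5 * (boundary_prob[s1].ravel() + boundary_prob[s2].ravel())
        for ai, bi, pi in zip(a, b, p):
            if ai != bi:
                key = (min(ai, bi), max(ai, bi))
                tot, cnt = strength.get(key, (0.0, 0))
                strength[key] = (tot + pi, cnt + 1)

    # Merge pairs whose mean boundary probability falls below the threshold.
    for (ai, bi), (tot, cnt) in strength.items():
        if tot / cnt < threshold:
            ra, rb = find(ai), find(bi)
            if ra != rb:
                parent[rb] = ra

    return np.vectorize(find)(labels)             # relabel pixels by root id
```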
dc.description.provenance: Made available in DSpace on 2021-06-16T08:41:17Z (GMT). No. of bitstreams: 1. U0001-0807202019442500.pdf: 23286421 bytes, checksum: 133254d36632642a02e8a8513adf670a (MD5). Previous issue date: 2020 [en]
dc.description.tableofcontents:
Abstract i
List of Figures iv
List of Tables vii
1 Introduction 1
2 Related Work 5
2.1 Superpixels 5
2.1.1 Mean Shift Superpixel 5
2.1.2 Superpixel Generation with Segmentation-Aware Affinity Loss (SEAL) Using Pixel Affinity Net (PAN) 9
2.1.3 Superpixel Sampling Network (SSN) 14
2.2 Classical Segmentation 18
2.2.1 Segmentation Using Superpixel (SAS) 18
2.2.2 Hierarchical Image Segmentation 23
2.3 Deep Learning in Image Segmentation 26
2.3.1 Fully Convolutional Networks (FCN) 26
3 Proposed Algorithms: DMMSS 28
3.1 Two-Superpixel Patch Generation 30
3.2 Training Architecture 33
3.3 Superpixel Pairing 35
3.4 Merging Procedure 36
3.5 Experiments 37
3.5.1 Segmentation Evaluation 38
3.5.2 Ablation Study 39
4 Proposed Algorithms: DMMSS-FCN 48
4.1 5-Channel Input Data 51
4.2 Output and Ground Truth 52
4.3 Training Architecture 56
4.4 Inference and Superpixel Merging 57
4.5 Experiments 58
4.5.1 Segmentation Evaluation 58
4.5.2 Run Time Analysis 60
4.5.3 Ablation Study 61
5 Simulations 68
5.1 BSDS500 Test Images 68
5.2 Real-World Images 81
5.2.1 Buildings 81
5.2.2 Animals 81
5.2.3 Night View 82
5.2.4 Items and Objects 82
6 Conclusion 90
References 92
dc.language.iso: en
dc.title: 基於學習並使用超像素對之影像分割 [zh_TW]
dc.title: Learning-Based Segmentation Using Superpixel Pairs [en]
dc.type: Thesis
dc.date.schoolyear: 108-2
dc.description.degree: 碩士 (Master)
dc.contributor.oralexamcommittee: 王鈺強 (Yu-Chiang Wang), 郭景明 (Jing-Ming Guo), 張榮吉 (Rong-Ji Zhang)
dc.subject.keyword: 影像分割, 全卷積神經網路, 超像素 [zh_TW]
dc.subject.keyword: Image Segmentation, Fully Convolutional Networks, Superpixel [en]
dc.relation.page: 96
dc.identifier.doi: 10.6342/NTU202001394
dc.rights.note: 有償授權 (compensated authorization)
dc.date.accepted: 2020-07-16
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science) [zh_TW]
dc.contributor.author-dept: 電信工程學研究所 (Graduate Institute of Communication Engineering) [zh_TW]
Appears in collections: 電信工程學研究所 (Graduate Institute of Communication Engineering)

Files in this item:
File | Size | Format
U0001-0807202019442500.pdf (currently not authorized for public access) | 22.74 MB | Adobe PDF