Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85434

Full metadata record
| DC 欄位 | 值 | 語言 |
|---|---|---|
| dc.contributor.advisor | 徐宏民 (Winston Hsu) | |
| dc.contributor.author | Tsung-Han Wu | en |
| dc.contributor.author | 吳宗翰 | zh_TW |
| dc.date.accessioned | 2023-03-19T23:16:33Z | - |
| dc.date.copyright | 2022-07-22 | |
| dc.date.issued | 2022 | |
| dc.date.submitted | 2022-07-18 | |
| dc.identifier.citation | [1] Radhakrishna Achanta, Appu Shaji, Kevin Smith, Aurelien Lucchi, Pascal Fua, and Sabine Süsstrunk. Slic superpixels compared to state-of-the-art superpixel methods. IEEE transactions on pattern analysis and machine intelligence, 34(11):2274–2282, 2012. [2] Iro Armeni, Ozan Sener, Amir R Zamir, Helen Jiang, Ioannis Brilakis, Martin Fischer, and Silvio Savarese. 3d semantic parsing of large-scale indoor spaces. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1534–1543, 2016. [3] Jordan T Ash, Chicheng Zhang, Akshay Krishnamurthy, John Langford, and Alekh Agarwal. Deep batch active learning by diverse, uncertain gradient lower bounds. In ICLR, 2020. [4] Matan Atzmon, Haggai Maron, and Yaron Lipman. Point convolutional neural networks by extension operators. ACM Trans. Graph., 37(4), July 2018. [5] Dena Bazazian, Josep R Casas, and Javier Ruiz-Hidalgo. Fast and robust edge extraction in unorganized point clouds. In 2015 international conference on digital image computing: techniques and applications (DICTA), pages 1–8. IEEE, 2015. [6] Jens Behley, Martin Garbade, Andres Milioto, Jan Quenzel, Sven Behnke, Cyrill Stachniss, and Jurgen Gall. Semantickitti: A dataset for semantic scene understanding of lidar sequences. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 9297–9307, 2019. [7] Arantxa Casanova, Pedro O. Pinheiro, Negar Rostamzadeh, and Christopher J. Pal. Reinforced active learning for image segmentation. In International Conference on Learning Representations, 2020. [8] Wei-Lun Chang, Hui-Po Wang, Wen-Hsiao Peng, and Wei-Chen Chiu. All about structure: Adapting structural information across domains for boosting semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1900–1909, 2019. [9] Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin P. Murphy, and Alan Loddon Yuille. 
Semantic image segmentation with deep convolutional nets and fully connected crfs. CoRR, abs/1412.7062, 2015. [10] Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, and Hartwig Adam. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European conference on computer vision (ECCV), pages 801–818, 2018. [11] Shuaijun Chen, Xu Jia, Jianzhong He, Yongjie Shi, and Jianzhuang Liu. Semi-supervised domain adaptation based on dual-level domain mixing for semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11018–11027, 2021. [12] Yiting Cheng, Fangyun Wei, Jianmin Bao, Dong Chen, Fang Wen, and Wenqiang Zhang. Dual path learning for domain adaptation of semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 9082–9091, 2021. [13] Christopher Choy, JunYoung Gwak, and Silvio Savarese. 4d spatio-temporal convnets: Minkowski convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3075–3084, 2019. [14] Marius Cordts, Mohamed Omran, Sebastian Ramos, Timo Rehfeld, Markus Enzweiler, Rodrigo Benenson, Uwe Franke, Stefan Roth, and Bernt Schiele. The cityscapes dataset for semantic urban scene understanding. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 3213–3223, 2016. [15] Angela Dai, Angel X Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, and Matthias Nießner. Scannet: Richly-annotated 3d reconstructions of indoor scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5828–5839, 2017. [16] Liang Du, Jingang Tan, Hongye Yang, Jianfeng Feng, Xiangyang Xue, Qibao Zheng, Xiaoqing Ye, and Xiaolin Zhang. Ssf-dan: Separated semantic feature based domain adaptation network for semantic segmentation. 
In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 982–991, 2019. [17] Bo Fu, Zhangjie Cao, Jianmin Wang, and Mingsheng Long. Transferable query selection for active domain adaptation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7272–7281, 2021. [18] Yarin Gal and Zoubin Ghahramani. Dropout as a bayesian approximation: Representing model uncertainty in deep learning. In international conference on machine learning, pages 1050–1059, 2016. [19] Yarin Gal, Riashat Islam, and Zoubin Ghahramani. Deep bayesian active learning with image data. In International Conference on Machine Learning, pages 1183–1192, 2017. [20] Timo Hackel, Nikolay Savinov, Lubor Ladicky, Jan D Wegner, Konrad Schindler, and Marc Pollefeys. Semantic3d.net: A new large-scale point cloud classification benchmark. arXiv preprint arXiv:1704.03847, 2017. [21] Trevor Hastie, Robert Tibshirani, and Jerome H Friedman. The elements of statistical learning: data mining, inference, and prediction, volume 2. Springer, 2009. [22] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016. [23] John R Hershey and Peder A Olsen. Approximating the kullback leibler divergence between gaussian mixture models. In 2007 IEEE International Conference on Acoustics, Speech and Signal Processing-ICASSP’07, volume 4, pages IV–317. IEEE, 2007. [24] Wei-Ning Hsu and Hsuan-Tien Lin. Active learning by learning. In Twenty-Ninth AAAI conference on artificial intelligence. Citeseer, 2015. [25] Tejaswi Kasarla, Gattigorla Nagendar, Guruprasad M Hegde, Vineeth Balasubramanian, and CV Jawahar. Region-based active learning for efficient labeling in semantic segmentation. In 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), pages 1109–1117. IEEE, 2019. 
[26] Myeongjin Kim and Hyeran Byun. Learning texture invariant representation for domain adaptation of semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12975–12984, 2020. [27] Andreas Kirsch, Joost van Amersfoort, and Yarin Gal. Batchbald: Efficient and diverse batch acquisition for deep bayesian active learning. In Advances in Neural Information Processing Systems, pages 7026–7037, 2019. [28] Felix Järemo Lawin, Martin Danelljan, Patrik Tosteberg, Goutam Bhat, Fahad Shahbaz Khan, and Michael Felsberg. Deep projective 3d semantic segmentation. In International Conference on Computer Analysis of Images and Patterns, pages 95–107. Springer, 2017. [29] Y Lin, G Vosselman, Y Cao, and MY Yang. Efficient training of semantic point cloud segmentation via active learning. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 2:243–250, 2020. [30] Yahao Liu, Jinhong Deng, Xinchen Gao, Wen Li, and Lixin Duan. Bapa-net: Boundary adaptation and prototype alignment for cross-domain semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 8801–8811, 2021. [31] Zhijian Liu, Haotian Tang, Yujun Lin, and Song Han. Point-voxel cnn for efficient 3d deep learning. In Advances in Neural Information Processing Systems, pages 965–975, 2019. [32] Huan Luo, Cheng Wang, Chenglu Wen, Ziyi Chen, Dawei Zai, Yongtao Yu, and Jonathan Li. Semantic labeling of mobile lidar point clouds via active learning and higher order mrf. IEEE Transactions on Geoscience and Remote Sensing, 56(7):3631–3644, 2018. [33] Yawei Luo, Liang Zheng, Tao Guan, Junqing Yu, and Yi Yang. Taking a closer look at domain shift: Category-level adversaries for semantics consistent domain adaptation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2507–2516, 2019. 
[34] Radek Mackowiak, Philip Lenz, Omair Ghori, Ferran Diego, Oliver Lange, and Carsten Rother. CEREALS - cost-effective region-based active learning for semantic segmentation. In British Machine Vision Conference 2018, BMVC 2018, Newcastle, UK, September 3-6, 2018, page 121. BMVA Press, 2018. [35] Ke Mei, Chuang Zhu, Jiaqi Zou, and Shanghang Zhang. Instance adaptive self-training for unsupervised domain adaptation. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXVI 16, pages 415–430. Springer, 2020. [36] A. Tuan Nguyen, Toan Tran, Yarin Gal, Philip Torr, and Atilim Gunes Baydin. KL guided domain adaptation. In International Conference on Learning Representations, 2022. [37] Munan Ning, Donghuan Lu, Dong Wei, Cheng Bian, Chenglang Yuan, Shuang Yu, Kai Ma, and Yefeng Zheng. Multi-anchor active domain adaptation for semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 9112–9122, 2021. [38] Yaniv Ovadia, Emily Fertig, Jie Ren, Zachary Nado, David Sculley, Sebastian Nowozin, Joshua Dillon, Balaji Lakshminarayanan, and Jasper Snoek. Can you trust your model’s uncertainty? evaluating predictive uncertainty under dataset shift. Advances in neural information processing systems, 32, 2019. [39] Jeremie Papon, Alexey Abramov, Markus Schoeler, and Florentin Worgotter. Voxel cloud connectivity segmentation-supervoxels for point clouds. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2027–2034, 2013. [40] Sujoy Paul, Yi-Hsuan Tsai, Samuel Schulter, Amit K Roy-Chowdhury, and Manmohan Chandraker. Domain adaptive semantic segmentation using weak labels. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part IX 16, pages 571–587. Springer, 2020. [41] Mark Pauly, Richard Keiser, and Markus Gross. Multi-scale feature extraction on point-sampled surfaces. 
In Computer graphics forum, volume 22, pages 281–289. Wiley Online Library, 2003. [42] Viraj Prabhu, Arjun Chandrasekaran, Kate Saenko, and Judy Hoffman. Active domain adaptation via clustering uncertainty-weighted embeddings. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 8505–8514, 2021. [43] Charles R Qi, Hao Su, Kaichun Mo, and Leonidas J Guibas. Pointnet: Deep learning on point sets for 3d classification and segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 652–660, 2017. [44] Charles Ruizhongtai Qi, Li Yi, Hao Su, and Leonidas J Guibas. Pointnet++: Deep hierarchical feature learning on point sets in a metric space. Advances in neural information processing systems, 30:5099–5108, 2017. [45] Stephan R Richter, Vibhav Vineet, Stefan Roth, and Vladlen Koltun. Playing for data: Ground truth from computer games. In European conference on computer vision, pages 102–118. Springer, 2016. [46] German Ros, Laura Sellart, Joanna Materzynska, David Vazquez, and Antonio M Lopez. The synthia dataset: A large collection of synthetic images for semantic segmentation of urban scenes. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 3234–3243, 2016. [47] Dan Roth and Kevin Small. Margin-based active learning for structured output spaces. In European Conference on Machine Learning, pages 413–424. Springer, 2006. [48] Nicholas Roy and Andrew McCallum. Toward optimal active learning through monte carlo estimation of error reduction. ICML, Williamstown, pages 441–448, 2001. [49] Bryan C Russell, Antonio Torralba, Kevin P Murphy, and William T Freeman. Labelme: a database and web-based tool for image annotation. International journal of computer vision, 77(1-3):157–173, 2008. [50] Ozan Sener and Silvio Savarese. Active learning for convolutional neural networks: A core-set approach. In International Conference on Learning Representations, 2018. 
[51] Burr Settles. Active learning literature survey. Technical report, University of Wisconsin-Madison Department of Computer Sciences, 2009. [52] Claude E Shannon. A mathematical theory of communication. The Bell system technical journal, 27(3):379–423, 1948. [53] Inkyu Shin, Dong-Jin Kim, Jae Won Cho, Sanghyun Woo, Kwanyong Park, and In So Kweon. Labor: Labeling only if required for domain adaptive semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 8588–8598, 2021. [54] Inkyu Shin, Sanghyun Woo, Fei Pan, and In So Kweon. Two-phase pseudo label densification for self-training based domain adaptation. In European conference on computer vision, pages 532–548. Springer, 2020. [55] Yawar Siddiqui, Julien Valentin, and Matthias Nießner. Viewal: Active learning with viewpoint entropy for semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9433–9443, 2020. [56] Anurag Singh, Naren Doraiswamy, Sawa Takamuku, Megh Bhalerao, Titir Dutta, Soma Biswas, Aditya Chepuri, Balasubramanian Vengatesan, and Naotake Natori. Improving semi-supervised domain adaptation using effective target selection and semantics. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2709–2718, 2021. [57] Jong-Chyi Su, Yi-Hsuan Tsai, Kihyuk Sohn, Buyu Liu, Subhransu Maji, and Manmohan Chandraker. Active adversarial domain adaptation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 739–748, 2020. [58] Haotian Tang, Zhijian Liu, Shengyu Zhao, Yujun Lin, Ji Lin, Hanrui Wang, and Song Han. Searching efficient 3d architectures with sparse point-voxel convolution. In European Conference on Computer Vision, pages 685–702. Springer, 2020. [59] Hugues Thomas, Charles R Qi, Jean-Emmanuel Deschaud, Beatriz Marcotegui, François Goulette, and Leonidas J Guibas. Kpconv: Flexible and deformable convolution for point clouds. 
In Proceedings of the IEEE International Conference on Computer Vision, pages 6411–6420, 2019. [60] Yi-Hsuan Tsai, Wei-Chih Hung, Samuel Schulter, Kihyuk Sohn, Ming-Hsuan Yang, and Manmohan Chandraker. Learning to adapt structured output space for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 7472–7481, 2018. [61] Tuan-Hung Vu, Himalaya Jain, Maxime Bucher, Matthieu Cord, and Patrick Pérez. Advent: Adversarial entropy minimization for domain adaptation in semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2517–2526, 2019. [62] D. Wang and Y. Shang. A new active labeling method for deep learning. In 2014 International Joint Conference on Neural Networks (IJCNN), pages 112–119, 2014. [63] Keze Wang, Dongyu Zhang, Ya Li, Ruimao Zhang, and Liang Lin. Cost-effective active learning for deep image classification. IEEE Transactions on Circuits and Systems for Video Technology, 27(12):2591–2600, 2016. [64] Lei Wang, Yuchun Huang, Yaolin Hou, Shenman Zhang, and Jie Shan. Graph attention convolution for point cloud semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 10296–10305, 2019. [65] Yuxi Wang, Junran Peng, and ZhaoXiang Zhang. Uncertainty-aware pseudo label refinery for domain adaptive semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 9092–9101, 2021. [66] Zhonghao Wang, Yunchao Wei, Rogerio Feris, Jinjun Xiong, Wen-Mei Hwu, Thomas S Huang, and Honghui Shi. Alleviating semantic-level shift: A semi-supervised domain adaptation method for semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pages 936–937, 2020. [67] Zhonghao Wang, Mo Yu, Yunchao Wei, Rogerio Feris, Jinjun Xiong, Wen-mei Hwu, Thomas S Huang, and Honghui Shi. 
Differential treatment for stuff and things: A simple unsupervised domain adaptation method for semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12635–12644, 2020. [68] Jiacheng Wei, Guosheng Lin, Kim-Hui Yap, Tzu-Yi Hung, and Lihua Xie. Multi-path region mining for weakly supervised 3d semantic segmentation on point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4384–4393, 2020. [69] Bichen Wu, Alvin Wan, Xiangyu Yue, and Kurt Keutzer. Squeezeseg: Convolutional neural nets with recurrent crf for real-time road-object segmentation from 3d lidar point cloud. In 2018 IEEE International Conference on Robotics and Automation (ICRA), pages 1887–1893. IEEE, 2018. [70] Bichen Wu, Xuanyu Zhou, Sicheng Zhao, Xiangyu Yue, and Kurt Keutzer. Squeezesegv2: Improved model structure and unsupervised domain adaptation for road-object segmentation from a lidar point cloud. In 2019 International Conference on Robotics and Automation (ICRA), pages 4376–4382. IEEE, 2019. [71] Tsung-Han Wu, Yueh-Cheng Liu, Yu-Kai Huang, Hsin-Ying Lee, Hung-Ting Su, Ping-Chia Huang, and Winston H Hsu. Redal: Region-based and diversity-aware active learning for point cloud semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 15510–15519, 2021. [72] Binhui Xie, Longhui Yuan, Shuang Li, Chi Harold Liu, and Xinjing Cheng. Towards fewer annotations: Active learning via region impurity and prediction uncertainty for domain adaptive semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8068–8078, 2022. [73] Binhui Xie, Longhui Yuan, Shuang Li, Chi Harold Liu, Xinjing Cheng, and Guoren Wang. Active learning for domain adaptation: An energy-based approach. In Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI-22), 2022. [74] Xun Xu and Gim Hee Lee. 
Weakly supervised semantic point cloud segmentation: Towards 10x fewer labels. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13706–13715, 2020. [75] Yanchao Yang and Stefano Soatto. Fda: Fourier domain adaptation for semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4085–4095, 2020. [76] Pan Zhang, Bo Zhang, Ting Zhang, Dong Chen, Yong Wang, and Fang Wen. Prototypical pseudo label denoising and target structure learning for domain adaptive semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12414–12424, 2021. [77] Zhedong Zheng and Yi Yang. Rectifying pseudo label learning via uncertainty estimation for domain adaptive semantic segmentation. International Journal of Computer Vision, 129(4):1106–1120, 2021. [78] Fan Zhou, Changjian Shui, Shichun Yang, Bincheng Huang, Boyu Wang, and Brahim Chaibdraa. Discriminative active learning for domain adaptation. Knowledge-Based Systems, 222:106986, 2021. [79] Qianyu Zhou, Zhengyang Feng, Qiqi Gu, Jiangmiao Pang, Guangliang Cheng, Xuequan Lu, Jianping Shi, and Lizhuang Ma. Context-aware mixup for domain adaptive semantic segmentation. arXiv preprint arXiv:2108.03557, 2021. [80] Yang Zou, Zhiding Yu, BVK Kumar, and Jinsong Wang. Unsupervised domain adaptation for semantic segmentation via class-balanced self-training. In Proceedings of the European conference on computer vision (ECCV), pages 289–305, 2018. [81] Yang Zou, Zhiding Yu, Xiaofeng Liu, BVK Kumar, and Jinsong Wang. Confidence regularized self-training. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 5982–5991, 2019. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85434 | - |
| dc.description.abstract | 雖然深度學習在監督語義分割方面取得了成功,獲得大規模的人工標註仍然具有挑戰性。在這種情況下,用少量訊息量大的標註資料最大化模型效能的主動學習便能派上用場。在本論文中,我們分別提出了一個用於單域和跨域語義分割的通用主動學習框架。對於單域問題,我們設計了一種區域多樣性主動學習策略,以最大限度地減少三維點雲語義分割的人工標註工作;針對領域自適應問題,我們提出了一種動態域密度主動域自適應方法,大幅減少了目標域標註的數量。大量實驗顯示,我們的方法高度優於以前的主動學習策略。此外,我們的方法可以在多個常用數據集上以不到 15% 的標註達到完全監督學習的 90% 以上效能。 | zh_TW |
| dc.description.abstract | Despite the success of deep learning on supervised semantic segmentation, obtaining large-scale manual annotations is still challenging. Active learning, which maximizes model performance with a small amount of informative labeled data, comes in handy for such a scenario. In this thesis, we present general active learning frameworks for single-domain and cross-domain semantic segmentation, respectively. For single-domain problems, we design a regional diversity active learning strategy that minimizes the manual labeling effort for 3D point cloud semantic segmentation; for domain adaptation problems, we propose a dynamic domain density active domain adaptation method that greatly reduces the number of target-domain annotations. Extensive experiments show that our methods significantly outperform previous active learning strategies. Moreover, they achieve over 90% of the fully supervised performance with less than 15% of the annotations on multiple commonly used datasets. | en |
| dc.description.provenance | Made available in DSpace on 2023-03-19T23:16:33Z (GMT). No. of bitstreams: 1 U0001-1307202210453400.pdf: 15307413 bytes, checksum: 2182dc3bad31209b2ee368013e8e53ea (MD5) Previous issue date: 2022 | en |
| dc.description.tableofcontents | Contents Page Acknowledgements i 摘要 ii Abstract iii Contents iv List of Figures viii List of Tables xiii Chapter 1 Introduction 1 1.1 Thesis Overview and Main Ideas 2 Chapter 2 Active Learning for Single Domain Segmentation 4 2.1 Background: Point Cloud Semantic Segmentation 4 2.2 Main Ideas: Region Diversity Active Learning 6 2.3 Related Works 8 2.3.1 Point Cloud Semantic Segmentation with Less Labeled Data 8 2.3.2 Deep Active Learning 9 Chapter 3 Region-based and Diversity-aware Active Learning 11 3.1 Overview 11 3.2 Region Information Estimation 13 3.2.1 Softmax Entropy 14 3.2.2 Color Discontinuity 14 3.2.3 Structural Complexity 15 3.3 Diversity-aware Selection 15 3.3.1 Region Similarity Measurement 16 3.3.2 Similar Region Penalization 17 3.4 Region Label Acquisition 18 Chapter 4 Experiment: Single Domain Active Learning 19 4.1 Experimental Settings 19 4.2 Active Learning Baseline Comparison 20 4.3 Ablation Studies 24 Chapter 5 Active Domain Adaptation for Segmentation 25 5.1 Background: Active Domain Adaptation 25 5.2 Main Ideas: Dynamic Domain Density Active Domain Adaptation 27 5.3 Related Work 29 5.3.1 Unsupervised Domain Adaptation for Semantic Segmentation 29 5.3.2 Domain Adaptation for Semantic Segmentation with Few Target Labels 30 5.3.3 Active Learning for Domain Adaptation 31 Chapter 6 Dynamic Density-aware Active Domain Adaptation 32 6.1 Overview 32 6.2 Density-aware Selection 33 6.2.1 Domain Density Estimation 33 6.2.2 Density Difference as Metric 34 6.2.3 Theoretical Foundation 35 6.2.4 Class-balanced Selection 37 6.3 Dynamic Scheduling Policy 38 Chapter 7 Experiment: Active Domain Adaptation 40 7.1 Experimental Settings 40 7.2 Comparison with Active Learning Baselines 41 7.3 Comparison with Domain Adaptation Methods 44 7.4 Ablation Studies 45 Chapter 8 47 8.1 Summary 47 8.2 Active Learning: Retrospect and Prospect 48 Reference 50 Appendix A -- Appendix for Region Diversity Active Learning 63 A.1 Implementation Details 63 
A.1.1 Network Training 63 A.1.2 Region Information Estimation 64 A.1.3 Diversity-aware Selection 65 A.2 Baseline Active Learning Methods 66 A.3 Experimental Result 68 Appendix B — Appendix for Dynamic Domain Density Active Learning 71 B.1 Proof 71 B.2 Implementation Details 71 B.2.1 UDA Warm-up 72 B.2.2 Density-aware Selection 72 B.2.3 Dynamic Scheduling Policy 73 B.2.4 Network Fine-tuning 73 B.3 Baseline Active Learning Methods 73 B.4 More Experimental Results and Analyses 77 B.4.1 Effectiveness of UDA Warm-up 77 B.4.2 Effectiveness of Initial Balance Coefficient 78 B.4.3 Effectiveness of Different Scheduling Policies 78 B.4.4 Influence of Inaccurately Predicted Categories 80 B.5 Qualitative Results 82 List of Figures 2.1 Human labeling efforts (colored areas) of different learning strategies. (a) In supervised training or traditional deep active learning, all points in a single point cloud are required to be labeled, which is labor-intensive. (b) Since few regions contribute to the model improvement, our region-based active learning strategy selects only a small portion of informative regions for label acquisition. Compared with case (a), our approach greatly reduces the cost of semantic labeling of walls and floors. (c) Moreover, considering the redundant labeling caused by repeating visually similar regions in the same querying batch, we develop a diversity-aware selection algorithm to further reduce redundant labeling effort (e.g., ceiling colored in green in (b) and (c)) by penalizing visually similar regions. 5 2.2 Not all annotated regions contribute to the model’s improvement. This toy experiment compares the performance contribution of fully labeled (a) and partially (b, w/o floor) labeled scans on the S3DIS [2] dataset. Specifically, the training dataset contains only 4 fully-labeled point cloud scans at the beginning. Another 4 fully or partially labeled scans are then added into the dataset at each following iteration. 
As shown in (c), compared to using all labels (solid line), removing floor labels (dash line) leads to similar performance on all classes including floor (blue), chairs (red), and bookcases (green). Additionally, (d) demonstrates that 12% of point annotation (21.7M fully labeled points versus 19.1M partially labeled points at 20 scans) is saved by simply removing the floor labels. Therefore, this shows that not all annotated regions contribute to the model’s improvement, and we can save the annotation costs by selecting key regions to annotate while maintaining the original performance. 6 3.1 Region-based and Diversity-Aware Active Learning Pipeline. In the proposed framework, a point cloud semantic segmentation model is first trained in supervision with labeled dataset DL. The model then produces softmax entropy and features of all regions from the unlabeled dataset DU. (a) Softmax entropy along with color discontinuity and structural complexity calculated from the unlabeled regions serves as selection indicators (Chap. 3.2), and (b) generates scores which are then adjusted by penalizing regions belonging to the same clusters grouped by the extracted features (Chap. 3.3). (c) The top-ranked regions are labeled by annotators and added to the labeled dataset DL for the next phase (Chap. 3.4). 12 3.2 Our method is able to find visually similar regions not only in the same point cloud (a) but also in different point clouds (b). The areas colored in red are the ceiling in an auditorium (a) and walls next to the door (b). These regions may cause redundant labeling effort if appearing in the same querying batch, and thus they are filtered by our diversity-aware selection (Chap. 3.3.1). 16 4.1 Experimental results of different active learning strategies on 2 datasets and 2 network architectures. We compare our region-based and diversity-aware active selection strategy with other existing baselines. 
It is obvious that our proposed method outperforms any existing active selection approaches under any combinations. Furthermore, our method is able to reach 90% of the fully supervised result with only 15% and 5% labeled points on the S3DIS [2] and SemanticKITTI [6] datasets, respectively. 21 4.2 Visualization for the inference result on S3DIS dataset with SPVCNN network architecture. We show some inference examples on the S3DIS Area 5 validation set. With our active learning strategy, the model can produce sharp boundaries (yellow bounding box in the first row) and recognize small objects, such as boards and chairs (yellow bounding box in the second row), with only 15% labeled points. 23 4.3 Visualization for the inference result on SemanticKITTI dataset with MinkowskiNet network architecture. We show some inference examples on the SemanticKITTI sequence 08 validation set. With our active learning strategy, the model can correctly recognize small vehicles (red bounding box in the first row) and identify people on the sidewalk (red bounding box in the second row) with merely 5% labeled points. 23 4.4 Ablation Study. The best combinations are altering labeling units from scans to regions (+Region), applying diversity-aware selection (+Div), and additional region information (+Color/Structure). Best viewed in color. (Chap. 4.3). 23 5.1 Different Exploration Techniques in the ADA. (a) [57, 78] proposed acquiring labels of target samples that are far from the source domain by the trained domain discriminator. However, the selected biased samples are inconsistent with the real target distribution. (b) [17, 42, 56] proposed selecting diverse samples in the target domain with clustering techniques for label acquisition. Nonetheless, redundant annotations exist on samples that are similar to the existing labeled source domain dataset. 
(c) We propose a density-aware ADA strategy that acquires labels for samples that are representative in the target domain yet scarce in the source domain, which is better than only considering either the source domain (a) or the target domain (b). 26 5.2 Analysis of Uncertainty Criterion in the ADA. Uncertainty-based active learning methods acquire labels of data close to the decision boundary (red background). Due to their inability to detect high-confidence errors under severe domain shift, several representative samples in the target domain far from the decision boundary will not be selected (yellow background) in earlier rounds, resulting in low label efficiency under few labeling budgets, as shown in (a, b, c). However, as the two domains gradually align by fine-tuning with acquired labels, the number of high-confidence but erroneous regions is rapidly reduced. Meanwhile, uncertainty measurement is able to capture the low-confidence errors with an accurate model in later rounds, as shown in (d, e, f). Hence, we propose a dynamic scheduling policy to pay more attention to domain exploration in earlier stages and rely more on model uncertainty later (Chap. 6.3). 28 7.1 Comparison with different active learning baselines. On two widely-used benchmarks, our D2ADA outperforms 8 existing active learning strategies, including uncertainty-based methods (MAR, CONF, ENT), hybrid approaches (ReDAL, BADGE), and ADA practices. The results suggest that compared to prior domain exploration methods shown in Figure 5.1 (a, b), our density-aware method (c) achieves significant improvement. More explanations and observations are reported in Chap. 7.2. 42 7.2 Comparison with various label-efficient domain adaptation methods. We compare our method with the state-of-the-art UDA method, ProDA [76], as well as various label-efficient approaches, including WDA [40], SSDA [11], ASS [66], MADA [37], LabOR [53] and EADA [73]. 
On two tasks and networks, our proposed D2ADA achieves the best result under the same number of labels. 44 7.3 The label distribution of our selected regions. Following [72], we draw histograms to compare the label distribution of the original target domain data (blue) and our 5% selected regions (red). The purple line chart shows the relative changes from the target dataset to our selection. Evidently, our class-balanced selection acquires more labels on minor classes, like 12x more labels on “train”, “motor”, and “bike”. 46 A.1 Visualization of divided sub-scene regions in SemanticKITTI dataset. Points of the same color in neighboring places belong to the same region. 65 B.2 Qualitative results of different approaches for the GTA5 → Cityscapes domain adaptation task. We present three success cases (in the top three rows) and two failure cases (in the bottom two rows) of our method. For more detailed explanation, please refer to Sec. B.5. 83 List of Tables 1.1 Outline of the thesis and publication contributions of each chapter. 3 4.1 Results of IoU performance (%) on SemanticKITTI [6]. Under only 5% of annotated points, our proposed ReDAL outperforms random selection and is on par with full supervision (Full). 22 4.2 Labeled Class Distribution Ratio (‰). With limited annotation budgets, our active method ReDAL queries more labels on small objects like a person but less on large uniform areas like roads. The selection strategy can mitigate the label imbalance problem and improve the performance on more complicated object scenes without hurting much on large areas, as shown in Table 4.1. 22 7.1 Comparison with different domain adaptation approaches on (a) GTA5 → Cityscapes and (b) SYNTHIA → Cityscapes. Our proposed D2ADA outperforms any existing methods on the overall mIoU and most per-class IoU with only 5% target annotations with different network backbones. mIoU* in (b) denotes the averaged scores across 13 categories used in [60]. 
To fairly compare all methods under the same number of annotations, we draw a chart in Figure 7.2. 43 7.2 Ablation studies. The three columns on the right show the model performance when 1%, 3%, and 5% target annotations are acquired by different active selection strategies. The results validate the effectiveness of our density-aware method, class-balanced selection, and dynamic scheduling policy, as discussed in Chap. 7.4. They also demonstrate that uncertainty-based methods perform worse under low labeling budgets (1%, 3%). 45 A.1 Results of IoU performance (%) on S3DIS [2] with SPVCNN [58]. 69 A.2 Results of IoU performance (%) on S3DIS [2] with MinkowskiNet [13]. 69 A.3 Results of IoU performance (%) on SemanticKITTI [6] with SPVCNN [58]. 69 A.4 Results of IoU performance (%) on SemanticKITTI [6] with MinkowskiNet [13]. 69 A.5 Results of IoU performance (%) with only 5% labeled points. The table shows that our ReDAL achieves better results on most classes compared with the random selection baseline. For some classes of small items and objects with complex boundaries, such as bicycle and bicyclist, our ReDAL greatly surpasses the random selection baseline and even outperforms the fully supervised result. 70 A.6 Labeled Class Distribution Ratio (‰). With limited annotation budgets, our active method ReDAL queries more labels on small objects like person and bicycle but fewer on large uniform areas like road and vegetation. The selection strategy can mitigate the label imbalance problem and improve the performance on more complicated object scenes without hurting much on large areas, as shown in Table A.5. 70 B.7 mIoU scores of UDA warm-up [60] on the two tasks. 78 B.8 We report the mIoU scores with different balance coefficients α. We found that using only the uncertainty-based method, i.e., α = 0, obtained the worst results among all combinations. The results show that using some or all of the obtained annotations through density-aware selection can improve model performance.
79 B.9 We compare different label budget scheduling strategies on the GTA5 → Cityscapes task. The results show that our designed half-decay method performs the best among all strategies. 80 B.10 Results of mIoU performance (%) on GTA5 [45] → Cityscapes [14] with the DeepLabV3+ network backbone. 80 B.11 Results of 16-class mIoU performance (%) on SYNTHIA [46] → Cityscapes [14] with the DeepLabV3+ network backbone. 81 B.12 Complete experimental results of our proposed D2ADA on (a) GTA5 → Cityscapes and (b) SYNTHIA → Cityscapes with different percentages of acquired target labels. 81 | |
| dc.language.iso | en | |
| dc.subject | 主動學習 | zh_TW |
| dc.subject | 領域自適應 | zh_TW |
| dc.subject | 三維點雲 | zh_TW |
| dc.subject | 語義分割 | zh_TW |
| dc.subject | 深度學習 | zh_TW |
| dc.subject | 3D Point Cloud | en |
| dc.subject | Deep Learning | en |
| dc.subject | Active Learning | en |
| dc.subject | Semantic Segmentation | en |
| dc.subject | Domain Adaptation | en |
| dc.title | 語義分割的主動學習: 區域多樣性和動態域密度 | zh_TW |
| dc.title | Active Learning for Semantic Segmentation: Region Diversity and Dynamic Domain Density | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 110-2 | |
| dc.description.degree | 碩士 | |
| dc.contributor.author-orcid | 0000-0003-1087-840X | |
| dc.contributor.oralexamcommittee | 邱維辰(Wei-Chen Chiu),陳奕廷(Yi-Ting Chen),張哲瀚(Frank CH Chang) | |
| dc.subject.keyword | 深度學習,主動學習,語義分割,領域自適應,三維點雲, | zh_TW |
| dc.subject.keyword | Deep Learning,Active Learning,Semantic Segmentation,Domain Adaptation,3D Point Cloud, | en |
| dc.relation.page | 83 | |
| dc.identifier.doi | 10.6342/NTU202201440 | |
| dc.rights.note | 同意授權(全球公開) | |
| dc.date.accepted | 2022-07-19 | |
| dc.contributor.author-college | 電機資訊學院 | zh_TW |
| dc.contributor.author-dept | 資訊工程學研究所 | zh_TW |
| dc.date.embargo-lift | 2022-07-22 | - |
| Appears in Collections: | 資訊工程學系 | |
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| U0001-1307202210453400.pdf | 14.95 MB | Adobe PDF | View/Open |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated by their specific license terms.
