Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88101

Full Metadata Record (DC Field: Value (Language))
dc.contributor.advisor: 陳文進 (zh_TW)
dc.contributor.advisor: Wen-Chin Chen (en)
dc.contributor.author: 劉怡萱 (zh_TW)
dc.contributor.author: Yi-Syuan Liou (en)
dc.date.accessioned: 2023-08-08T16:18:17Z
dc.date.available: 2023-11-09
dc.date.copyright: 2023-08-08
dc.date.issued: 2023
dc.date.submitted: 2023-07-14
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88101
dc.description.abstract: Obtaining a large-scale labeled object detection dataset can be costly and time-consuming, as it involves annotating images with bounding boxes and class labels. Thus, specialized active learning methods have been proposed to reduce this cost by selecting either coarse-grained samples or fine-grained instances from unlabeled data for labeling. However, the former approaches suffer from redundant labeling, while the latter generally lead to training instability and sampling bias. To address these challenges, we propose a novel approach called Multi-scale Region-based Active Learning (MuRAL) for object detection. MuRAL identifies informative regions of various scales to reduce annotation costs for well-learned objects and improve training performance. The informative region score is designed to consider both the predicted confidence of instances and the distribution of each object category, enabling our method to focus on difficult-to-detect classes. Moreover, MuRAL employs a scale-aware selection strategy that ensures diverse regions are selected from different scales for labeling and downstream fine-tuning, which enhances training stability. Our method surpasses all existing coarse-grained and fine-grained baselines on the Cityscapes and MS COCO datasets, and achieves significant improvement on difficult categories. (en)
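The scoring and selection described in the abstract can be sketched as follows. This is an illustrative reconstruction, not the thesis's implementation: the helper names (`informative_score`, `scale_aware_select`), the region dictionary layout, and the specific definitions (uncertainty as 1 − confidence, rarity as inverse class frequency, round-robin picking across scales) are all assumptions made for the sketch.

```python
def informative_score(region, class_counts, total_objects):
    # Illustrative score: a region is informative when its detections are
    # low-confidence (uncertain) and belong to rarely seen (hard) classes.
    score = 0.0
    for cls, conf in region["detections"]:
        uncertainty = 1.0 - conf
        rarity = 1.0 - class_counts.get(cls, 0) / max(total_objects, 1)
        score += uncertainty * rarity
    return score

def scale_aware_select(regions, budget):
    # Illustrative scale-aware strategy: rank candidate regions within each
    # scale, then pick round-robin across scales so the labeled batch stays
    # diverse in object scale.
    by_scale = {}
    for r in regions:
        by_scale.setdefault(r["scale"], []).append(r)
    for group in by_scale.values():
        group.sort(key=lambda r: r["score"], reverse=True)
    selected = []
    scales = list(by_scale)
    i = 0
    while len(selected) < budget and any(by_scale.values()):
        group = by_scale[scales[i % len(scales)]]
        if group:
            selected.append(group.pop(0))
        i += 1
    return selected
```

Under this sketch, a low-confidence detection of a rare class contributes most to a region's score, and the round-robin pass prevents any single scale from dominating the batch sent for annotation.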
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-08-08T16:18:17Z. No. of bitstreams: 0 (en)
dc.description.provenance: Made available in DSpace on 2023-08-08T16:18:17Z (GMT). No. of bitstreams: 0 (en)
dc.description.tableofcontents:
Verification Letter from the Oral Examination Committee
Acknowledgements
摘要 (Chinese Abstract)
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 Related Work
  2.1 Object Detection with Label Efficiency
  2.2 Active Learning
Chapter 3 Problem Statement
Chapter 4 Method
  4.1 MuRAL Overview
  4.2 Multi-scale Region Candidate Generation
  4.3 Informative Score Calculation
  4.4 Scale-aware Region Selection
  4.5 Region Label Acquisition
Chapter 5 Experiments
  5.1 Experimental Settings
  5.2 Main Results
    5.2.1 Comparison with Coarse-grained Methods
    5.2.2 Comparison with Fine-grained Methods
  5.3 Ablation Study
  5.4 Case Study on Object Categories
Chapter 6 Conclusion
References
Appendix A  Appendix for Multi-scale Region-based Active Learning
  A.1 Implementation Details
  A.2 Active Learning Baselines
    A.2.1 Coarse-grained Methods
    A.2.2 Fine-grained Methods
  A.3 Extensive Analyses and Results
    A.3.1 Experimental Results
    A.3.2 Visualization
  A.4 Limitations and Future Work
dc.language.iso: en
dc.subject: Deep Learning (en)
dc.subject: Object Detection (en)
dc.subject: Multi-scale (en)
dc.subject: Active Learning (en)
dc.title: 基於區域和多尺度物體檢測的主動學習方法 (A Region-based and Multi-scale Active Learning Method for Object Detection) (zh_TW)
dc.title: MuRAL: Multi-Scale Region-based Active Learning for Object Detection (en)
dc.type: Thesis
dc.date.schoolyear: 111-2
dc.description.degree: 碩士 (Master)
dc.contributor.coadvisor: 徐宏民 (zh_TW)
dc.contributor.coadvisor: Winston H. Hsu (en)
dc.contributor.oralexamcommittee: 葉梅珍;陳奕廷;陳駿丞 (zh_TW)
dc.contributor.oralexamcommittee: Mei-Chen Yeh;Yi-Ting Chen;Jun-Cheng Chen (en)
dc.subject.keyword: Deep Learning, Active Learning, Object Detection, Multi-scale (en)
dc.relation.page: 41
dc.identifier.doi: 10.6342/NTU202301241
dc.rights.note: Authorized (open access worldwide)
dc.date.accepted: 2023-07-14
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science)
dc.contributor.author-dept: 資訊網路與多媒體研究所 (Graduate Institute of Networking and Multimedia)
Appears in Collections: 資訊網路與多媒體研究所 (Graduate Institute of Networking and Multimedia)

Files in This Item:
File: ntu-111-2.pdf | Size: 10.69 MB | Format: Adobe PDF


All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
