Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88101

Full Metadata Record (DC Field: Value (Language))
dc.contributor.advisor: 陳文進 (zh_TW)
dc.contributor.advisor: Wen-Chin Chen (en)
dc.contributor.author: 劉怡萱 (zh_TW)
dc.contributor.author: Yi-Syuan Liou (en)
dc.date.accessioned: 2023-08-08T16:18:17Z
dc.date.available: 2023-11-09
dc.date.copyright: 2023-08-08
dc.date.issued: 2023
dc.date.submitted: 2023-07-14
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88101
dc.description.abstract: Obtaining a large-scale labeled object detection dataset can be costly and time-consuming, as it involves annotating images with bounding boxes and class labels. Thus, specialized active learning methods have been proposed to reduce this cost by selecting either coarse-grained samples or fine-grained instances from unlabeled data for labeling. However, the former approaches suffer from redundant labeling, while the latter generally lead to training instability and sampling bias. To address these challenges, we propose a novel approach called Multi-scale Region-based Active Learning (MuRAL) for object detection. MuRAL identifies informative regions of various scales to reduce annotation costs for well-learned objects and improve training performance. The informative region score is designed to consider both the predicted confidence of instances and the distribution of each object category, enabling our method to focus on difficult-to-detect classes. Moreover, MuRAL employs a scale-aware selection strategy that ensures diverse regions are selected from different scales for labeling and downstream fine-tuning, which enhances training stability. Our method surpasses all existing coarse-grained and fine-grained baselines on the Cityscapes and MS COCO datasets, and achieves significant improvement on difficult categories. (en)
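The scoring and selection described in the abstract can be sketched as follows. This is an illustrative reconstruction, not the thesis's implementation: the helper names (`informative_score`, `scale_aware_select`), the region dictionary layout, and the specific definitions (uncertainty as 1 − confidence, rarity as inverse class frequency, round-robin picking across scales) are all assumptions made for the sketch.

```python
def informative_score(region, class_counts, total_objects):
    # Illustrative score: a region is informative when its detections are
    # low-confidence (uncertain) and belong to rarely seen (hard) classes.
    score = 0.0
    for cls, conf in region["detections"]:
        uncertainty = 1.0 - conf
        rarity = 1.0 - class_counts.get(cls, 0) / max(total_objects, 1)
        score += uncertainty * rarity
    return score

def scale_aware_select(regions, budget):
    # Illustrative scale-aware strategy: rank candidate regions within each
    # scale, then pick round-robin across scales so the labeled batch stays
    # diverse in object scale.
    by_scale = {}
    for r in regions:
        by_scale.setdefault(r["scale"], []).append(r)
    for group in by_scale.values():
        group.sort(key=lambda r: r["score"], reverse=True)
    selected = []
    scales = list(by_scale)
    i = 0
    while len(selected) < budget and any(by_scale.values()):
        group = by_scale[scales[i % len(scales)]]
        if group:
            selected.append(group.pop(0))
        i += 1
    return selected
```

Under this sketch, a low-confidence detection of a rare class contributes most to a region's score, and the round-robin pass prevents any single scale from dominating the batch sent for annotation.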
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-08-08T16:18:17Z. No. of bitstreams: 0 (en)
dc.description.provenance: Made available in DSpace on 2023-08-08T16:18:17Z (GMT). No. of bitstreams: 0 (en)
dc.description.tableofcontents:
Verification Letter from the Oral Examination Committee
Acknowledgements
摘要 (Chinese Abstract)
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 Related Work
  2.1 Object Detection with Label Efficiency
  2.2 Active Learning
Chapter 3 Problem Statement
Chapter 4 Method
  4.1 MuRAL Overview
  4.2 Multi-scale Region Candidate Generation
  4.3 Informative Score Calculation
  4.4 Scale-aware Region Selection
  4.5 Region Label Acquisition
Chapter 5 Experiments
  5.1 Experimental Settings
  5.2 Main Results
    5.2.1 Comparison with Coarse-grained Methods
    5.2.2 Comparison with Fine-grained Methods
  5.3 Ablation Study
  5.4 Case Study on Object Categories
Chapter 6 Conclusion
References
Appendix A  Appendix for Multi-scale Region-based Active Learning
  A.1 Implementation Details
  A.2 Active Learning Baselines
    A.2.1 Coarse-grained Methods
    A.2.2 Fine-grained Methods
  A.3 Extensive Analyses and Results
    A.3.1 Experimental Results
    A.3.2 Visualization
  A.4 Limitations and Future Work
dc.language.iso: en
dc.subject: Deep Learning (en)
dc.subject: Object Detection (en)
dc.subject: Multi-scale (en)
dc.subject: Active Learning (en)
dc.title: 基於區域和多尺度物體檢測的主動學習方法 (A Region-based and Multi-scale Active Learning Method for Object Detection) (zh_TW)
dc.title: MuRAL: Multi-Scale Region-based Active Learning for Object Detection (en)
dc.type: Thesis
dc.date.schoolyear: 111-2
dc.description.degree: 碩士 (Master)
dc.contributor.coadvisor: 徐宏民 (zh_TW)
dc.contributor.coadvisor: Winston H. Hsu (en)
dc.contributor.oralexamcommittee: 葉梅珍;陳奕廷;陳駿丞 (zh_TW)
dc.contributor.oralexamcommittee: Mei-Chen Yeh;Yi-Ting Chen;Jun-Cheng Chen (en)
dc.subject.keyword: Deep Learning, Active Learning, Object Detection, Multi-scale (en)
dc.relation.page: 41
dc.identifier.doi: 10.6342/NTU202301241
dc.rights.note: Authorized (open access worldwide)
dc.date.accepted: 2023-07-14
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science)
dc.contributor.author-dept: 資訊網路與多媒體研究所 (Graduate Institute of Networking and Multimedia)
Appears in Collections: 資訊網路與多媒體研究所 (Graduate Institute of Networking and Multimedia)

Files in This Item:
File: ntu-111-2.pdf | Size: 10.69 MB | Format: Adobe PDF


All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
