NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94323

Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 徐宏民 | zh_TW
dc.contributor.advisor | Winston H. Hsu | en
dc.contributor.author | 陳靖元 | zh_TW
dc.contributor.author | Ching-Yuan Chen | en
dc.date.accessioned | 2024-08-15T16:49:01Z | -
dc.date.available | 2024-08-16 | -
dc.date.copyright | 2024-08-15 | -
dc.date.issued | 2024 | -
dc.date.submitted | 2024-08-01 | -
dc.identifier.citation | [1] J. T. Ash, C. Zhang, A. Krishnamurthy, J. Langford, and A. Agarwal. Deep batch active learning by diverse, uncertain gradient lower bounds, 2020.
[2] A. Chang, A. Dai, T. Funkhouser, M. Halber, M. Nießner, M. Savva, S. Song, A. Zeng, and Y. Zhang. Matterport3d: Learning from rgb-d data in indoor environments, 2017.
[3] B. Chen, F. Xia, B. Ichter, K. Rao, K. Gopalakrishnan, M. S. Ryoo, A. Stone, and D. Kappler. Open-vocabulary queryable scene representations for real world planning. In 2023 IEEE International Conference on Robotics and Automation (ICRA), pages 11509–11522. IEEE, 2023.
[4] J. Choi, I. Elezi, H.-J. Lee, C. Farabet, and J. M. Alvarez. Active learning for deep object detection via probabilistic modeling, 2021.
[5] Y. Gal and Z. Ghahramani. Dropout as a bayesian approximation: Representing model uncertainty in deep learning, 2016.
[6] G. Georgakis, B. Bucher, A. Arapin, K. Schmeckpeper, N. Matni, and K. Daniilidis. Uncertainty-driven planner for exploration and navigation, 2022.
[7] G. Georgakis, B. Bucher, K. Schmeckpeper, S. Singh, and K. Daniilidis. Learning to map for active semantic goal navigation, 2022.
[8] C. Huang, O. Mees, A. Zeng, and W. Burgard. Visual language maps for robot navigation. In 2023 IEEE International Conference on Robotics and Automation (ICRA), pages 10608–10615. IEEE, 2023.
[9] W. Huang, C. Wang, R. Zhang, Y. Li, J. Wu, and L. Fei-Fei. Voxposer: Composable 3d value maps for robotic manipulation with language models. In 7th Annual Conference on Robot Learning, 2023.
[10] K. M. Jatavallabhula, A. Kuwajerwala, Q. Gu, M. Omama, T. Chen, A. Maalouf, S. Li, G. S. Iyer, S. Saryazdi, N. V. Keetha, et al. Conceptfusion: Open-set multimodal 3d mapping. In ICRA2023 Workshop on Pretraining for Robotics (PT4R), 2023.
[11] H. Jiang, B. Huang, R. Wu, Z. Li, S. Garg, H. Nayyeri, S. Wang, and Y. Li. RoboEXP: Action-conditioned scene graph via interactive exploration for robotic manipulation. In First Workshop on Vision-Language Models for Navigation and Manipulation at ICRA 2024, 2024.
[12] A. Kendall and Y. Gal. What uncertainties do we need in bayesian deep learning for computer vision?, 2017.
[13] A. Kirsch, J. van Amersfoort, and Y. Gal. Batchbald: Efficient and diverse batch acquisition for deep bayesian active learning, 2019.
[14] B. Li, K. Q. Weinberger, S. Belongie, V. Koltun, and R. Ranftl. Language-driven semantic segmentation. In International Conference on Learning Representations, 2022.
[15] L. H. Li, P. Zhang, H. Zhang, J. Yang, C. Li, Y. Zhong, L. Wang, L. Yuan, L. Zhang, J.-N. Hwang, K.-W. Chang, and J. Gao. Grounded language-image pretraining. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 10965–10975, 2022.
[16] Y. Ovadia, E. Fertig, J. Ren, Z. Nado, D. Sculley, S. Nowozin, J. V. Dillon, B. Lakshminarayanan, and J. Snoek. Can you trust your model's uncertainty? evaluating predictive uncertainty under dataset shift, 2019.
[17] X. Puig, E. Undersander, A. Szot, M. D. Cote, R. Partsey, J. Yang, R. Desai, A. W. Clegg, M. Hlavac, T. Min, T. Gervet, V. Vondruš, V.-P. Berges, J. Turner, O. Maksymets, Z. Kira, M. Kalakrishnan, J. Malik, D. S. Chaplot, U. Jain, D. Batra, A. Rai, and R. Mottaghi. Habitat 3.0: A co-habitat for humans, avatars and robots, 2023.
[18] A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, G. Krueger, and I. Sutskever. Learning transferable visual models from natural language supervision. In ICML, 2021.
[19] M. Savva, A. Kadian, O. Maksymets, Y. Zhao, E. Wijmans, B. Jain, J. Straub, J. Liu, V. Koltun, J. Malik, D. Parikh, and D. Batra. Habitat: A Platform for Embodied AI Research. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019.
[20] B. Settles. Active learning literature survey. Computer Sciences Technical Report 1648, University of Wisconsin–Madison, 2009.
[21] N. M. M. Shafiullah, C. Paxton, L. Pinto, S. Chintala, and A. Szlam. Clip-fields: Weakly supervised semantic fields for robotic memory. In ICRA2023 Workshop on Pretraining for Robotics (PT4R), 2023.
[22] A. Szot, A. Clegg, E. Undersander, E. Wijmans, Y. Zhao, J. Turner, N. Maestre, M. Mukadam, D. Chaplot, O. Maksymets, A. Gokaslan, V. Vondrus, S. Dharur, F. Meier, W. Galuba, A. Chang, Z. Kira, V. Koltun, J. Malik, M. Savva, and D. Batra. Habitat 2.0: Training home assistants to rearrange their habitat. In Advances in Neural Information Processing Systems (NeurIPS), 2021.
[23] A. Takmaz, E. Fedele, R. Sumner, M. Pollefeys, F. Tombari, and F. Engelmann. Openmask3d: Open-vocabulary 3d instance segmentation. In Thirty-seventh Conference on Neural Information Processing Systems, 2023.
[24] K. Wang, X. Yan, D. Zhang, L. Zhang, and L. Lin. Towards human-machine cooperation: Self-supervised sample mining for object detection, 2018.
[25] K. Wang, D. Zhang, Y. Li, R. Zhang, and L. Lin. Cost-effective active learning for deep image classification. IEEE Transactions on Circuits and Systems for Video Technology, 27(12):2591–2600, Dec. 2017.
[26] J. Wu, J. Chen, and D. Huang. Entropy-based active learning for object detection with progressive diversity constraint, 2022.
[27] Z. Ye, P. Liu, J. Liu, X. Tang, and W. Zhao. Practice makes perfect: An adaptive active learning framework for image classification. Neurocomputing, 196:95–106, 2016.
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94323 | -
dc.description.abstract | In training-free robotic applications, pre-explored semantic maps built with visual language models (VLMs) have proven highly effective as a foundational element. However, existing approaches assume the map is accurate and provide no effective mechanism for revising decisions based on an erroneous map. This thesis introduces the notion of uncertainty and uses it to estimate the accuracy and quality of the map, enabling the robot to locate regions that are more likely to be wrong and to revise erroneous decisions caused by an inaccurate map without additional labels. We demonstrate the effectiveness of the proposed method on two modern map backbones, VLMaps and OpenMask3D, and show improvements on both. | zh_TW
dc.description.abstract | Pre-explored semantic maps, constructed through prior exploration using visual language models (VLMs), have proven effective as a foundational element for training-free robotic applications. However, existing approaches assume the map's accuracy and do not provide effective mechanisms for revising decisions based on incorrect maps. This work introduces Uncertainty-Aware Memory Updating and Re-Proposing, which estimates the accuracy and quality of the map through uncertainty, enabling the agent to locate the areas with a higher chance of being incorrect and to revise erroneous decisions stemming from inaccurate maps without additional labels. We demonstrate the effectiveness of the proposed method using two modern map backbones, VLMaps and OpenMask3D, and show improvements on both. | en
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-08-15T16:49:01Z. No. of bitstreams: 0 | en
dc.description.provenance | Made available in DSpace on 2024-08-15T16:49:01Z (GMT). No. of bitstreams: 0 | en
dc.description.tableofcontents | Verification Letter from the Oral Examination Committee i
Acknowledgements iii
Abstract (Chinese) v
Abstract vii
Contents ix
List of Figures xi
List of Tables xiii
Chapter 1 Introduction 1
Chapter 2 Related Work 5
2.1 Pre-Explored Semantic Map . . . . . . . . . . . . . . . . . . . . . . 5
2.2 Perception with Visual Language Model . . . . . . . . . . . . . . . . 6
2.3 Active Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Chapter 3 Method 9
3.1 Failure Case Analysis . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.2 Framework Overview . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.3 Modules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.3.1 Efficient Map Update . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.3.2 Re-Proposing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.4 Uncertainty Measures . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.4.1 Single-view: Entropy . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.4.2 Multi-view: Standard Error & KL-Divergence . . . . . . . . . . . . 15
Chapter 4 Experiments 17
4.1 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
4.2 Main Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
4.3 Ablation Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.3.1 Efficacy of Each Module . . . . . . . . . . . . . . . . . . . . . . . 20
4.3.2 Efficient Map Update . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.3.2.1 Selection of Uncertainty Measures . . . . . . . . . . . 21
4.3.2.2 Generalizability . . . . . . . . . . . . . . . . . . . . . 22
4.3.3 Re-proposing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
4.3.3.1 Selection of Confidence and Uncertainty Measures . . 22
4.3.3.2 Generalizability . . . . . . . . . . . . . . . . . . . . . 23
Chapter 5 Conclusion 25
References 27
Appendix A — More Method Details 31
A.1 Pseudocode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
A.2 Limitation of KL Divergence . . . . . . . . . . . . . . . . . . . . . . 32
dc.language.iso | en | -
dc.subject | Pre-Exploration | zh_TW
dc.subject | Uncertainty | zh_TW
dc.subject | Navigation | zh_TW
dc.subject | Embodied Agent Navigation | en
dc.subject | Uncertainty | en
dc.subject | Pre-Explore | en
dc.title | Uncertainty-Based Map Updating and Re-Proposing | zh_TW
dc.title | Uncertainty-Aware Memory Updating and Re-Proposing | en
dc.type | Thesis | -
dc.date.schoolyear | 112-2 | -
dc.description.degree | Master's | -
dc.contributor.oralexamcommittee | 陳尚澤;陳文進;葉梅珍 | zh_TW
dc.contributor.oralexamcommittee | Shang-Tse Chen;WC Chen;Mei-Chen Yeh | en
dc.subject.keyword | Uncertainty, Pre-Exploration, Navigation | zh_TW
dc.subject.keyword | Uncertainty, Pre-Explore, Embodied Agent Navigation | en
dc.relation.page | 32 | -
dc.identifier.doi | 10.6342/NTU202402241 | -
dc.rights.note | Authorized (open access worldwide) | -
dc.date.accepted | 2024-08-04 | -
dc.contributor.author-college | College of Electrical Engineering and Computer Science | -
dc.contributor.author-dept | Department of Computer Science and Information Engineering | -
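The table of contents above names three uncertainty measures (single-view entropy, multi-view standard error, and KL divergence). Purely as a rough illustration of how such measures could be computed over per-cell class distributions in a semantic map — the exact formulations in the thesis may differ, and the function names here are invented for the sketch — consider:

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Single-view uncertainty: Shannon entropy of one cell's class distribution."""
    p = np.asarray(p, dtype=float)
    return float(-np.sum(p * np.log(p + eps)))

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two class distributions, e.g. two views of one cell."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def multi_view_uncertainty(view_probs):
    """Multi-view uncertainty for one cell: mean pairwise KL divergence across
    views, plus the standard error of the mean across views (averaged over
    classes). High values flag map regions more likely to be wrong."""
    view_probs = np.asarray(view_probs, dtype=float)  # shape (n_views, n_classes)
    n = len(view_probs)
    kls = [kl_divergence(view_probs[i], view_probs[j])
           for i in range(n) for j in range(n) if i != j]
    mean_kl = float(np.mean(kls))
    # standard error of the per-class mean over views, averaged across classes
    se = float(np.mean(view_probs.std(axis=0, ddof=1) / np.sqrt(n)))
    return mean_kl, se
```

Under this sketch, a cell whose views agree on a confident label scores low on all three measures, while a cell whose views disagree (or whose single-view distribution is flat) scores high and becomes a candidate for re-observation.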
Appears in collections: Department of Computer Science and Information Engineering

Files in this item:
File | Size | Format
ntu-112-2.pdf | 4.74 MB | Adobe PDF


All items in this system are protected by copyright, with all rights reserved, unless otherwise indicated.
