Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94323

Full metadata record
| DC 欄位 | 值 | 語言 |
|---|---|---|
| dc.contributor.advisor | 徐宏民 | zh_TW |
| dc.contributor.advisor | Winston H. Hsu | en |
| dc.contributor.author | 陳靖元 | zh_TW |
| dc.contributor.author | Ching-Yuan Chen | en |
| dc.date.accessioned | 2024-08-15T16:49:01Z | - |
| dc.date.available | 2024-08-16 | - |
| dc.date.copyright | 2024-08-15 | - |
| dc.date.issued | 2024 | - |
| dc.date.submitted | 2024-08-01 | - |
| dc.identifier.citation | [1] J. T. Ash, C. Zhang, A. Krishnamurthy, J. Langford, and A. Agarwal. Deep batch active learning by diverse, uncertain gradient lower bounds, 2020.
[2] A. Chang, A. Dai, T. Funkhouser, M. Halber, M. Nießner, M. Savva, S. Song, A. Zeng, and Y. Zhang. Matterport3D: Learning from RGB-D data in indoor environments, 2017.
[3] B. Chen, F. Xia, B. Ichter, K. Rao, K. Gopalakrishnan, M. S. Ryoo, A. Stone, and D. Kappler. Open-vocabulary queryable scene representations for real world planning. In 2023 IEEE International Conference on Robotics and Automation (ICRA), pages 11509–11522. IEEE, 2023.
[4] J. Choi, I. Elezi, H.-J. Lee, C. Farabet, and J. M. Alvarez. Active learning for deep object detection via probabilistic modeling, 2021.
[5] Y. Gal and Z. Ghahramani. Dropout as a Bayesian approximation: Representing model uncertainty in deep learning, 2016.
[6] G. Georgakis, B. Bucher, A. Arapin, K. Schmeckpeper, N. Matni, and K. Daniilidis. Uncertainty-driven planner for exploration and navigation, 2022.
[7] G. Georgakis, B. Bucher, K. Schmeckpeper, S. Singh, and K. Daniilidis. Learning to map for active semantic goal navigation, 2022.
[8] C. Huang, O. Mees, A. Zeng, and W. Burgard. Visual language maps for robot navigation. In 2023 IEEE International Conference on Robotics and Automation (ICRA), pages 10608–10615. IEEE, 2023.
[9] W. Huang, C. Wang, R. Zhang, Y. Li, J. Wu, and L. Fei-Fei. VoxPoser: Composable 3D value maps for robotic manipulation with language models. In 7th Annual Conference on Robot Learning, 2023.
[10] K. M. Jatavallabhula, A. Kuwajerwala, Q. Gu, M. Omama, T. Chen, A. Maalouf, S. Li, G. S. Iyer, S. Saryazdi, N. V. Keetha, et al. ConceptFusion: Open-set multimodal 3D mapping. In ICRA2023 Workshop on Pretraining for Robotics (PT4R), 2023.
[11] H. Jiang, B. Huang, R. Wu, Z. Li, S. Garg, H. Nayyeri, S. Wang, and Y. Li. RoboEXP: Action-conditioned scene graph via interactive exploration for robotic manipulation. In First Workshop on Vision-Language Models for Navigation and Manipulation at ICRA 2024, 2024.
[12] A. Kendall and Y. Gal. What uncertainties do we need in Bayesian deep learning for computer vision?, 2017.
[13] A. Kirsch, J. van Amersfoort, and Y. Gal. BatchBALD: Efficient and diverse batch acquisition for deep Bayesian active learning, 2019.
[14] B. Li, K. Q. Weinberger, S. Belongie, V. Koltun, and R. Ranftl. Language-driven semantic segmentation. In International Conference on Learning Representations, 2022.
[15] L. H. Li, P. Zhang, H. Zhang, J. Yang, C. Li, Y. Zhong, L. Wang, L. Yuan, L. Zhang, J.-N. Hwang, K.-W. Chang, and J. Gao. Grounded language-image pretraining. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 10965–10975, 2022.
[16] Y. Ovadia, E. Fertig, J. Ren, Z. Nado, D. Sculley, S. Nowozin, J. V. Dillon, B. Lakshminarayanan, and J. Snoek. Can you trust your model's uncertainty? Evaluating predictive uncertainty under dataset shift, 2019.
[17] X. Puig, E. Undersander, A. Szot, M. D. Cote, R. Partsey, J. Yang, R. Desai, A. W. Clegg, M. Hlavac, T. Min, T. Gervet, V. Vondruš, V.-P. Berges, J. Turner, O. Maksymets, Z. Kira, M. Kalakrishnan, J. Malik, D. S. Chaplot, U. Jain, D. Batra, A. Rai, and R. Mottaghi. Habitat 3.0: A co-habitat for humans, avatars and robots, 2023.
[18] A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, G. Krueger, and I. Sutskever. Learning transferable visual models from natural language supervision. In ICML, 2021.
[19] M. Savva, A. Kadian, O. Maksymets, Y. Zhao, E. Wijmans, B. Jain, J. Straub, J. Liu, V. Koltun, J. Malik, D. Parikh, and D. Batra. Habitat: A Platform for Embodied AI Research. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019.
[20] B. Settles. Active learning literature survey. Computer Sciences Technical Report 1648, University of Wisconsin–Madison, 2009.
[21] N. M. M. Shafiullah, C. Paxton, L. Pinto, S. Chintala, and A. Szlam. CLIP-Fields: Weakly supervised semantic fields for robotic memory. In ICRA2023 Workshop on Pretraining for Robotics (PT4R), 2023.
[22] A. Szot, A. Clegg, E. Undersander, E. Wijmans, Y. Zhao, J. Turner, N. Maestre, M. Mukadam, D. Chaplot, O. Maksymets, A. Gokaslan, V. Vondrus, S. Dharur, F. Meier, W. Galuba, A. Chang, Z. Kira, V. Koltun, J. Malik, M. Savva, and D. Batra. Habitat 2.0: Training home assistants to rearrange their habitat. In Advances in Neural Information Processing Systems (NeurIPS), 2021.
[23] A. Takmaz, E. Fedele, R. Sumner, M. Pollefeys, F. Tombari, and F. Engelmann. OpenMask3D: Open-vocabulary 3D instance segmentation. In Thirty-seventh Conference on Neural Information Processing Systems, 2023.
[24] K. Wang, X. Yan, D. Zhang, L. Zhang, and L. Lin. Towards human-machine cooperation: Self-supervised sample mining for object detection, 2018.
[25] K. Wang, D. Zhang, Y. Li, R. Zhang, and L. Lin. Cost-effective active learning for deep image classification. IEEE Transactions on Circuits and Systems for Video Technology, 27(12):2591–2600, Dec. 2017.
[26] J. Wu, J. Chen, and D. Huang. Entropy-based active learning for object detection with progressive diversity constraint, 2022.
[27] Z. Ye, P. Liu, J. Liu, X. Tang, and W. Zhao. Practice makes perfect: An adaptive active learning framework for image classification. Neurocomputing, 196:95–106, 2016. | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94323 | - |
| dc.description.abstract | 在無需訓練的機器人應用中,以視覺語言模型(VLMs)探索構建的預探索語義地圖作為基礎元素已被證明非常有效。然而,現有的方法假設地圖是準確的,並且沒有提供有效的機制來根據錯誤的地圖修正決策。這篇論文引入了不確定概念,並通過不確定性來估計地圖的準確性和品質,使得機器人可以找到更有可能出錯的區域,並在沒有額外標籤的情況下修正由不準確地圖引起的錯誤決策。
我們使用兩個現代地圖的基石模型,VLMaps 和 OpenMask3D,展示了我們所提出方法的有效性,並在這兩者上都顯示了改善。 | zh_TW |
| dc.description.abstract | Pre-Explored Semantic Maps, constructed through prior exploration using visual language models (VLMs), have proven effective as a foundational element for training-free robotic applications. However, existing approaches assume the map's accuracy and provide no effective mechanism for revising decisions based on incorrect maps. This work introduces Uncertainty-Aware Memory Updating and Re-Proposing, which estimates the accuracy and quality of the map through uncertainty, enabling the agent to locate the areas most likely to be incorrect and to revise erroneous decisions stemming from inaccurate maps, without requiring additional labels.
We demonstrate the effectiveness of the proposed method using two modern map backbones, VLMaps and OpenMask3D, and show improvements on both. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-08-15T16:49:01Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2024-08-15T16:49:01Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Verification Letter from the Oral Examination Committee i
Acknowledgements iii
摘要 v
Abstract vii
Contents ix
List of Figures xi
List of Tables xiii
Chapter 1 Introduction 1
Chapter 2 Related Work 5
2.1 Pre-Explored Semantic Map 5
2.2 Perception with Visual Language Model 6
2.3 Active Learning 6
Chapter 3 Method 9
3.1 Failure Case Analysis 9
3.2 Framework Overview 10
3.3 Modules 11
3.3.1 Efficient Map Update 11
3.3.2 Re-Proposing 12
3.4 Uncertainty Measures 15
3.4.1 Single-view: Entropy 15
3.4.2 Multi-view: Standard Error & KL-Divergence 15
Chapter 4 Experiments 17
4.1 Setup 17
4.2 Main Results 18
4.3 Ablation Study 19
4.3.1 Efficacy of Each Module 20
4.3.2 Efficient Map Update 21
4.3.2.1 Selection of Uncertainty Measures 21
4.3.2.2 Generalizability 22
4.3.3 Re-proposing 22
4.3.3.1 Selection of Confidence and Uncertainty Measures 22
4.3.3.2 Generalizability 23
Chapter 5 Conclusion 25
References 27
Appendix A — More Method Details 31
A.1 Pseudocode 31
A.2 Limitation of KL Divergence 32 | - |
| dc.language.iso | en | - |
| dc.subject | 預探索 | zh_TW |
| dc.subject | 不確定性 | zh_TW |
| dc.subject | 導航 | zh_TW |
| dc.subject | Embodied Agent Navigation | en |
| dc.subject | Uncertainty | en |
| dc.subject | Pre-Explore | en |
| dc.title | 基於不確定性之地圖更新與再查找 | zh_TW |
| dc.title | Uncertainty-Aware Memory Updating and Re-Proposing | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 112-2 | - |
| dc.description.degree | Master's | - |
| dc.contributor.oralexamcommittee | 陳尚澤;陳文進;葉梅珍 | zh_TW |
| dc.contributor.oralexamcommittee | Shang-Tse Chen;WC Chen;Mei-Chen Yeh | en |
| dc.subject.keyword | 不確定性,預探索,導航 | zh_TW |
| dc.subject.keyword | Uncertainty, Pre-Explore, Embodied Agent Navigation | en |
| dc.relation.page | 32 | - |
| dc.identifier.doi | 10.6342/NTU202402241 | - |
| dc.rights.note | Authorized for release (open access worldwide) | - |
| dc.date.accepted | 2024-08-04 | - |
| dc.contributor.author-college | College of Electrical Engineering and Computer Science | - |
| dc.contributor.author-dept | Department of Computer Science and Information Engineering | - |
| Appears in Collections: | Department of Computer Science and Information Engineering | |
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-112-2.pdf | 4.74 MB | Adobe PDF | View/Open |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
