6D Pose Estimation of Novel Objects Based on RGB Images

Yen-Yang Hung (洪彥揚)

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/95036

Title:	6D Pose Estimation of Novel Objects Based on RGB Images (基於 RGB 圖像中新物體的 6D 姿態估計)
Author:	Yen-Yang Hung (洪彥揚)
Advisor:	Chu-Song Chen (陳祝嵩)
Keywords:	6DoF pose estimation, single RGB image, differentiable rendering, 2D-3D keypoint matching
Year of publication:	2024
Degree:	Master's
Abstract:	We present a method for accurately estimating the 6DoF pose of novel objects from a single RGB image. The method combines 2D-3D keypoint correspondences with render-and-compare pose optimization. First, an existing object detector locates the target object in the input image. Next, an initial 6DoF pose is estimated from 2D-3D keypoint matches. Finally, the pose is refined with 3D Gaussian rendering by comparing the rendered image against the input image. The approach thus pairs a point-cloud model for 2D-3D keypoint correspondence with a 3D Gaussian model for rendering, and the refinement relies on efficient differentiable rendering. Experiments on the LINEMOD, YCB-V, and OnePose-LowTexture datasets demonstrate the effectiveness of the approach, particularly in real-world and indoor scenes.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/95036
DOI:	10.6342/NTU202403279
Full-text license:	Not authorized
Appears in collections:	Department of Computer Science and Information Engineering
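
The two pose stages named in the abstract (an initial pose from 2D-3D keypoint matches, then refinement by comparing a prediction against the image) can be illustrated with a small sketch. This is not the thesis implementation: it swaps the photometric render-and-compare loss over 3D Gaussians for a plain keypoint reprojection error, and it uses numerical gradient descent instead of a differentiable renderer. The function names, the 6-vector pose layout (axis-angle rotation followed by translation), and the pinhole intrinsics tuple are all assumptions made for illustration.

```python
import math


def rodrigues(rvec):
    """Convert an axis-angle vector (rx, ry, rz) to a 3x3 rotation matrix."""
    theta = math.sqrt(sum(a * a for a in rvec))
    if theta < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = (a / theta for a in rvec)
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    return [
        [c + kx * kx * v, kx * ky * v - kz * s, kx * kz * v + ky * s],
        [ky * kx * v + kz * s, c + ky * ky * v, ky * kz * v - kx * s],
        [kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v],
    ]


def project(points3d, pose, intrinsics):
    """Project object-frame 3D points to pixels with a pinhole camera.

    pose = (rx, ry, rz, tx, ty, tz); intrinsics = (fx, fy, cx, cy).
    Both layouts are assumptions of this sketch.
    """
    R = rodrigues(pose[:3])
    tx, ty, tz = pose[3:]
    fx, fy, cx, cy = intrinsics
    pixels = []
    for X, Y, Z in points3d:
        xc = R[0][0] * X + R[0][1] * Y + R[0][2] * Z + tx
        yc = R[1][0] * X + R[1][1] * Y + R[1][2] * Z + ty
        zc = R[2][0] * X + R[2][1] * Y + R[2][2] * Z + tz
        if zc <= 1e-9:
            raise ValueError("point is behind the camera")
        pixels.append((fx * xc / zc + cx, fy * yc / zc + cy))
    return pixels


def reprojection_error(points3d, points2d, pose, intrinsics):
    """Mean squared pixel distance between projected and matched 2D keypoints."""
    try:
        proj = project(points3d, pose, intrinsics)
    except ValueError:
        return math.inf  # invalid pose: treat as infinitely bad
    total = sum((u - u0) ** 2 + (v - v0) ** 2
                for (u, v), (u0, v0) in zip(proj, points2d))
    return total / len(points2d)


def refine_pose(points3d, points2d, pose, intrinsics, iters=50, eps=1e-6):
    """Refine a pose by gradient descent on the reprojection error.

    Gradients are central finite differences; a backtracking line search
    accepts a step only if it lowers the error.
    """
    pose = list(pose)
    err = reprojection_error(points3d, points2d, pose, intrinsics)
    for _ in range(iters):
        grad = []
        for i in range(6):
            hi, lo = pose[:], pose[:]
            hi[i] += eps
            lo[i] -= eps
            grad.append((reprojection_error(points3d, points2d, hi, intrinsics)
                         - reprojection_error(points3d, points2d, lo, intrinsics))
                        / (2.0 * eps))
        step = 1.0
        for _ in range(40):
            cand = [p - step * g for p, g in zip(pose, grad)]
            cand_err = reprojection_error(points3d, points2d, cand, intrinsics)
            if cand_err < err:
                pose, err = cand, cand_err
                break
            step *= 0.5
        else:
            break  # no step size improved the error; stop early
    return pose, err
```

Plain gradient descent converges slowly on this objective, so the sketch only shows the shape of the optimization loop. A practical pipeline would more likely obtain the initial pose from a PnP solver with outlier rejection and refine it with analytic gradients through a differentiable renderer, as the abstract describes.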

Files in this document:

File	Size	Format
ntu-112-2.pdf (not authorized for public access)	24.9 MB	Adobe PDF

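Unless their copyright terms are specifically stated otherwise, documents in this system are protected by copyright, with all rights reserved.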