Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101567

| Title: | 攝影機定位方法與應用 Methodology and Application of Camera Localization |
| Author: | 劉郁昌 Yo-Chung Lau |
| Advisor: | 洪一平 Yi-Ping Hung |
| Keywords: | Camera Localization, Pose Estimation, Pose Tracking, Deep Learning, Extended Reality |
| Publication Year: | 2026 |
| Degree: | Doctoral |
| Abstract: | This thesis explores camera localization from two perspectives: application-oriented implementation and methodological design.

At the application level, we take the immersive theater production Circuit Garden as a case study and develop Circuit Garden XR (CGXR), a mobile extended reality (XR) application. CGXR leverages existing marker-based camera localization techniques to anchor 360° performance videos precisely within the original performance venue, allowing users to revisit the theatrical experience after the show by following their own movement paths and viewing perspectives in the physical space. User studies show that CGXR not only improves users' understanding of the performance content but also provides an immersive and engaging interactive experience. This work also revealed two practical challenges and limitations in camera localization: computational efficiency and multiuser support.

In view of this, at the methodological level, we first address the computational efficiency of localization and propose a real-time object-based camera localization method tailored for mobile platforms. The method adopts a client-server monocular inertial-aided visual tracking architecture that minimizes the client-side computational load while estimating object pose. To further improve accuracy and reliability, we introduce a bias self-correction mechanism and a pose inspection algorithm. In addition, we provide a novel object pose dataset featuring synchronized RGB images and IMU data for comprehensive evaluation.

We then extend the research scope to multiuser support and present CollabLoc, a vision-based multiuser spatial localization system built on collaborative computation. The system incorporates adaptive server-side resource management, client-pose-fused image retrieval, optical-flow-enhanced feature matching, and a pose fusion module for drift reduction, effectively supporting real-time multiuser localization in known scenes. Experimental results show that, in multi-client scenarios, CollabLoc achieves about a 1.91× speedup in localization processing over the baseline method while maintaining good localization accuracy. |
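The client-server idea summarized above, where a drift-prone local pose estimate on the client is periodically blended with a server-refined pose, can be illustrated with a minimal sketch. Everything here (the `fuse_pose` name, the single blending weight `alpha`, the position-only state) is a hypothetical simplification for illustration; the thesis's actual pose fusion module is not specified in this record.

```python
import numpy as np

def fuse_pose(client_pos, server_pos, alpha=0.3):
    """Blend a drift-prone client position with a server correction.

    alpha is the trust placed in the server refinement: 0 keeps the
    client pose unchanged, 1 snaps fully to the server pose.
    (Hypothetical parameterization for illustration only.)
    """
    client_pos = np.asarray(client_pos, dtype=float)
    server_pos = np.asarray(server_pos, dtype=float)
    return (1.0 - alpha) * client_pos + alpha * server_pos

# A client whose dead-reckoned position has drifted 0.5 m along x:
client = [1.5, 0.0, 0.0]   # drifted local estimate
server = [1.0, 0.0, 0.0]   # server-refined global estimate
fused = fuse_pose(client, server, alpha=0.3)  # pulled partway back toward the server pose
```

Blending rather than snapping to each server result keeps the client's motion smooth between (possibly infrequent) server responses, which matters when server round-trips are shared among many users.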
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101567 |
| DOI: | 10.6342/NTU202600195 |
| Full-Text Authorization: | Authorized (access restricted to campus) |
| Electronic Full-Text Release Date: | 2026-02-12 |
| Appears in Collections: | Graduate Institute of Networking and Multimedia |
Files in this item:
| File | Size | Format |
|---|---|---|
| ntu-114-1.pdf (restricted to NTU campus IP; use the VPN service for off-campus access) | 23.46 MB | Adobe PDF |
Items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
