Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92576

| Field | Value |
|---|---|
| Title: | 基於深度學習選擇機制之視覺慣性里程計 Selection Mechanism for Visual Inertial Odometry with Deep Learning Prediction |
| Author: | 陳昱仁 Yu-Jen Chen |
| Advisor: | 簡韶逸 Shao-Yi Chien |
| Keywords: | Pose Tracking, Pose Estimation, Camera Pose, Deep Learning, Inertial Odometry, IMU |
| Publication Year: | 2024 |
| Degree: | Master's |
| Abstract: | Visual inertial odometry (VIO) aims to predict a trajectory by estimating egomotion. Over the past few years, deep-learning VIO methods have exhibited superior performance compared with traditional geometric methods. Nonetheless, current approaches feed both visual and inertial data into every estimation. In deep-learning-based methods, consistently using both data sources results in a remarkably high computational cost per estimation. Since VO/VIO systems frequently run on edge devices, where computing power is limited and real-time operation is essential, this computational cost must be addressed. This thesis introduces a deep-learning-based VIO method that reduces computational cost by selectively deactivating the visual and inertial feature extractors in different situations. More precisely, we introduce a network called SelectNet, which learns when to activate or deactivate the visual and inertial feature extractors based on the current state. Furthermore, we employ Gumbel-Softmax resampling, a lightweight technique, to train SelectNet, making the training process differentiable. Extensive experiments demonstrate that our approach achieves competitive performance compared with other VIO methods while reducing computational cost by 40.13%. In addition, we investigate interpretability by visualizing the probability trajectories in various scenarios, unveiling interesting connections between the visual and inertial sensory input data. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92576 |
| DOI: | 10.6342/NTU202304450 |
| Full-text access: | Not authorized |
| Appears in Collections: | Graduate Institute of Electronics Engineering |
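The abstract describes training SelectNet's discrete activate/deactivate decisions with Gumbel-Softmax resampling, which keeps the selection differentiable. Below is a minimal, self-contained sketch of the Gumbel-Softmax trick itself; the function name, logit values, and the two-way visual-extractor gate are illustrative assumptions, not the thesis's actual implementation.

```python
import math
import random

def gumbel_softmax(logits, tau=1.0, rng=random):
    """Gumbel-Softmax relaxation: a differentiable approximation of
    drawing a one-hot sample from a categorical distribution."""
    # Draw Gumbel(0, 1) noise; clamp the uniform draw away from 0
    # so the inner log() stays finite.
    gumbels = [-math.log(-math.log(max(rng.random(), 1e-12)))
               for _ in logits]
    # Perturb the logits and apply a temperature-scaled softmax.
    scores = [(l + g) / tau for l, g in zip(logits, gumbels)]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

# Hypothetical two-way gate: logits for "run the visual extractor"
# vs. "skip it". The soft weights always sum to 1.
weights = gumbel_softmax([2.0, -1.0], tau=0.5)
run_visual = weights[0] > weights[1]
```

During training, the soft weights let gradients flow through the selection; at inference, discretizing them (e.g. by argmax) yields a hard run/skip decision, which is how skipping a feature extractor could translate into the kind of computation savings the abstract reports.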
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-112-2.pdf (not authorized for public access) | 4.96 MB | Adobe PDF |
Items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated by their specific license terms.
