Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/50164
Title: | 利用球狀空間組件模型應用於3D手部姿勢估計之深層神經網路學習 Learning a Deep Network with Spherical Part Model for 3D Hand Pose Estimation |
Author: | Tzu-Yang Chen 陳子揚 |
Advisor: | 傅立成 |
Keywords: | Hand pose estimation, Deep learning, Convolutional neural network, Spherical part model |
Publication year: | 2016 |
Degree: | Master's |
Abstract: | 3D hand pose estimation, the task of locating the joints of a hand, has been a popular research topic in recent years. Because it provides a natural interface between humans and cyberspace and makes for a good user experience, it has been widely used in virtual reality and human-computer interaction applications.
Despite the rapid development of this field, 3D hand pose estimation remains difficult. First, the hand must be detected in a complex environment, yet its appearance changes with viewpoint, distance, and subject. Second, the hand has many degrees of freedom and is prone to severe self-occlusion, which makes pose estimation hard. The rise of depth sensors and deep learning, however, has made this challenging task feasible.
In this thesis, we aim to build a 3D hand pose estimation system that correctly detects human hands and accurately estimates their pose from depth images. To make the system robust, we design a hand model called the spherical part model (SPM) and train a deep convolutional neural network using this model. Moreover, to reduce the influence of human error, we integrate the two with a data-driven approach. As a result, our network can estimate hand pose more accurately based on prior knowledge of the human hand.
To demonstrate the advantages of our method, we conduct a complete experiment on two public datasets and one self-collected dataset. The results show that our system detects human hands with almost 90% average precision and achieves an average error distance of about 10 millimeters on pose estimation, substantially outperforming other state-of-the-art methods. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/50164 |
DOI: | 10.6342/NTU201601892 |
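Full-text license: | Paid license |
Appears in collections: | Department of Computer Science and Information Engineering |
Files in this item:
File | Size | Format | |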
---|---|---|---|
ntu-105-1.pdf (not currently authorized for public access) | 3.16 MB | Adobe PDF |
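All items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.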