NTU Theses and Dissertations Repository › 電機資訊學院 › 電子工程學研究所
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/77622

Full metadata record
dc.contributor.advisor: 簡韶逸
dc.contributor.author: Yun-Chu Chen (en)
dc.contributor.author: 陳韻竹 (zh_TW)
dc.date.accessioned: 2021-07-10T22:12:09Z
dc.date.available: 2021-07-10T22:12:09Z
dc.date.copyright: 2018-08-02
dc.date.issued: 2018
dc.date.submitted: 2018-07-23
dc.identifier.citation: [1] Y. Sugano, Y. Matsushita, and Y. Sato, “Learning-by-Synthesis for Appearance-Based 3D Gaze Estimation,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, pp. 1821–1828.
[2] X. Zhang, Y. Sugano, M. Fritz, and A. Bulling, “Appearance-Based Gaze Estimation in the Wild,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 4511–4520.
[3] K. Krafka, A. Khosla, P. Kellnhofer, H. Kannan, S. Bhandarkar, W. Matusik, and A. Torralba, “Eye Tracking for Everyone,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 2176–2184.
[4] K. He, X. Zhang, S. Ren, and J. Sun, “Deep Residual Learning for Image Recognition,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770–778.
[5] D. Li, D. Winfield, and D. J. Parkhurst, “Starburst: A hybrid algorithm for video-based eye tracking combining feature-based and model-based approaches,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW), vol. 3, 2005, pp. 79–79.
[6] K.-H. Tan, D. J. Kriegman, and N. Ahuja, “Appearance-based Eye Gaze Estimation,” in Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), 2002, pp. 191–195.
[7] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks,” in Proceedings of Neural Information Processing Systems (NIPS), 2012, pp. 1097–1105.
[8] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-Based Learning Applied to Document Recognition,” in Proceedings of the IEEE, vol. 86, 1998, pp. 2278–2324.
[9] B.-C. Chen, P.-C. Wu, and S.-Y. Chien, “Real-time eye localization, blink detection, and gaze estimation system without infrared illumination,” in Proceedings of IEEE International Conference on Image Processing (ICIP), 2015, pp. 715–719.
[10] P.-J. Chiu, “Illumination-Robust Appearance-Based Gaze Estimation on Head-Mounted Devices,” M.S. Thesis, National Taiwan University, Taipei, Taiwan, 2017.
[11] A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer, “Automatic differentiation in PyTorch,” in Proceedings of Neural Information Processing Systems Workshop (NIPSW), 2017.
[12] E. Jones, T. Oliphant, P. Peterson et al., “SciPy: Open source scientific tools for Python,” 2001–, [Online; accessed 2018-05-30]. Available: http://www.scipy.org/
[13] M. D. Zeiler and R. Fergus, “Visualizing and Understanding Convolutional Networks,” in Proceedings of European Conference on Computer Vision (ECCV), 2014, pp. 818–833.
[14] L. van der Maaten and G. Hinton, “Visualizing Data using t-SNE,” Journal of Machine Learning Research (JMLR), vol. 9, pp. 2579–2605, 2008.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/77622
dc.description.abstract: From scientific research to commercial applications such as human-machine interfaces, psychology, and the study of human behavior, eye tracking is an essential tool; in head-mounted devices such as virtual reality (VR) and augmented reality (AR) in particular, gaze information can serve as the most natural form of user input. However, most eye-tracking techniques on the market require one or more infrared light sources to locate and track the pupil contour, so many eye trackers lose substantial accuracy, or fail outright, in outdoor environments under sunlight, which limits where AR devices can be used.
To address this problem, appearance-based algorithms that sit inside the head-mounted device and rely purely on visible light have emerged. Although multi-point calibration gives appearance-based algorithms a certain level of accuracy, these algorithms operate on whole eye images, so when images become misaligned due to camera shift, their accuracy degrades sharply as well.
Therefore, instead of using raw image data directly, we need to extract more stable and representative features. Building on the rapid progress of convolutional neural networks, in this work we use a CNN to extract abstract features from eye images and then map those features to the corresponding 2D gaze coordinates through a nine-point linear calibration technique.
Experimental results show that, without additional tracking to crop the eye region and under tests covering 80 random lighting conditions, the proposed eye-tracking system achieves a mean error of less than 2 degrees across 10 subjects. (zh_TW)
dc.description.abstract: From research to commercial products, in domains such as psychology, human behavior, and human-machine interfaces, eye tracking is an important tool; in head-mounted devices in particular, such as virtual reality (VR) and augmented reality (AR), gaze information is the most natural user input. However, most commercial eye-tracking techniques use several infrared (IR) illuminators as additional light sources to detect and track the pupil contour. When these eye trackers are used outdoors under sunlight, most of them cannot work, which limits the usage of wearable AR devices.
To address this issue, head-mounted, appearance-based algorithms have been proposed recently. However, since these algorithms directly process raw eye images, they need multiple calibration points to achieve the same accuracy as IR-based methods; when the eye images become misaligned due to camera-eye shift, the gaze-prediction error increases dramatically.
In this research, taking advantage of the rapid development of convolutional neural networks (CNNs), we use a CNN model to extract robust and representative eye features, then apply a 9-point linear-combination technique to those features to map the eye appearance to the 2D gaze position.
In the experiments, the proposed eye-tracking system achieves a mean error of less than 2 degrees for 10 test subjects under 80 lighting conditions, without detecting and cropping the eye region. (en)
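The abstract describes mapping CNN eye features to a 2D gaze point through a 9-point linear combination over calibration samples. A minimal sketch of how such a mapping could work is shown below; the function name, array shapes, and use of a least-squares solve are assumptions for illustration, not the thesis's actual implementation.

```python
import numpy as np

def estimate_gaze(probe_feat, calib_feats, calib_gazes):
    """Estimate a 2D gaze point by linear combination of calibration samples.

    probe_feat:  (d,)  CNN feature vector of the current eye image.
    calib_feats: (9, d) CNN features of the nine calibration images.
    calib_gazes: (9, 2) known 2D gaze targets for those images.
    """
    # Express the probe feature as a weighted combination of the nine
    # calibration features: solve calib_feats.T @ w ~= probe_feat.
    w, *_ = np.linalg.lstsq(calib_feats.T, probe_feat, rcond=None)
    # Apply the same weights to the known calibration gaze positions.
    return w @ calib_gazes  # (2,) estimated gaze point
```

With this formulation, a probe whose feature vector lies between two calibration features produces a gaze estimate between the two corresponding calibration targets.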
dc.description.provenance: Made available in DSpace on 2021-07-10T22:12:09Z (GMT). No. of bitstreams: 1
ntu-107-R05943006-1.pdf: 5262419 bytes, checksum: 0cf888cbf8fe461122120d50fd250af0 (MD5)
Previous issue date: 2018 (en)
dc.description.tableofcontents: Abstract iv
List of Figures vii
List of Tables ix
1 Introduction 1
1.1 Introduction of Eye Tracking 1
1.2 Introduction of Convolutional Neural Network 2
1.3 Motivation 3
1.4 Contribution 4
2 Related Work 5
2.1 Learning-by-Synthesis for Appearance-Based 3D Gaze Estimation 5
2.2 Appearance-Based Gaze Estimation in the Wild 6
2.3 Eye Tracking for Everyone 7
2.4 Summary of Related Works 9
3 Proposed Method 11
3.1 Overview 11
3.2 Algorithm and CNN Architecture 12
3.2.1 Datasets 12
3.2.2 Feature Extraction by Convolutional Neural Network 14
3.2.3 Linear Combination and Gaze Mapping 16
4 Experiments 18
4.1 Experimental Setup 18
4.2 Results 18
4.2.1 Eye Corner Localization 18
4.2.2 Gaze Estimation 19
4.2.3 Short Discussion About CNN 20
4.3 Observations 21
4.3.1 CNN Intermediate Output 21
4.3.2 t-SNE Plot 23
4.4 Analysis 25
4.4.1 Ablation Test 25
4.4.2 Model-Saving Criteria 26
4.5 Comparisons 27
4.5.1 Lighting Condition 27
4.5.2 Number of Calibrating Points 28
4.5.3 Output Value Scaling 30
4.5.4 Dataset Random Split 31
4.5.5 Cropped and Warped Eye Image 32
4.5.6 Compared with Other Work 32
5 Conclusion 34
Reference 35
A Gaze Error Map 38
B Results of Other Training Settings 42
dc.language.iso: en
dc.subject: 可見光 (zh_TW)
dc.subject: 視線追蹤 (zh_TW)
dc.subject: 卷積神經網路 (zh_TW)
dc.subject: 頭戴式 (zh_TW)
dc.subject: Head-Mounted (en)
dc.subject: Visible Light (en)
dc.subject: Convolutional Neural Network (en)
dc.subject: Eye-Tracking (en)
dc.title: 以卷積神經網路實現之頭戴式視線追蹤系統 (zh_TW)
dc.title: Head-Mounted Eye-Tracking System with Convolutional Neural Network (en)
dc.type: Thesis
dc.date.schoolyear: 106-2
dc.description.degree: 碩士 (Master)
dc.contributor.oralexamcommittee: 陳宏銘, 葉素玲, 張家銘
dc.subject.keyword: 視線追蹤, 卷積神經網路, 頭戴式, 可見光 (zh_TW)
dc.subject.keyword: Eye-Tracking, Convolutional Neural Network, Head-Mounted, Visible Light (en)
dc.relation.page: 42
dc.identifier.doi: 10.6342/NTU201800903
dc.rights.note: 未授權 (not authorized for public access)
dc.date.accepted: 2018-07-24
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science) (zh_TW)
dc.contributor.author-dept: 電子工程學研究所 (Graduate Institute of Electronics Engineering) (zh_TW)
Appears in collections: 電子工程學研究所

Files in this item:
ntu-107-R05943006-1.pdf — 5.14 MB, Adobe PDF (not authorized for public access)


All items in the system are protected by copyright, with all rights reserved, unless otherwise indicated in their copyright terms.
