NTU Theses and Dissertations Repository
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/83109
Title: 基於深度學習之注視點預測用於360°全景影片注視點串流
Foveated Streaming with Deep Learning Gaze Prediction on 360° Videos
Authors: 楊凱翔
Kai-Siang Yang
Advisor: 簡韶逸
Shao-Yi Chien
Keyword: Foveated Rendering, Foveated Streaming, Video Streaming, Panoramic Videos, 360° Videos, Virtual Reality
Publication Year: 2022
Degree: Master's
Abstract: Owing to the development of virtual reality (VR), VR head-mounted displays (HMDs) are equipped with increasingly high-resolution displays to deliver a more immersive experience. While viewing videos on a VR HMD, a user can see only a small region at a time, known as the viewport. When streaming 360° panoramic videos, transmitting high-resolution content outside the viewport wastes network bandwidth, since the user cannot see it. One solution to this problem is foveated streaming: the region the user is looking at is streamed at high resolution, while the remaining panorama is streamed at low resolution. Foveated streaming not only saves network bandwidth but also mimics the physiology of the human eye, mitigating the vergence-accommodation conflict (VAC).
In this thesis, we propose the first foveated streaming system with future gaze prediction. Our gaze prediction model uses deep learning to obtain the future gaze position in advance, reducing end-to-end latency and increasing the streaming frame rate, without affecting the user's viewing experience. We also propose layered foveated rendering, which combines Unity objects with foveated rendering: the cropped high-resolution region is rendered onto a plane object in Unity instead of resizing the panoramic region and performing image stitching. This technique reduces computational cost and GPU load, and our proposed system saves at least 90% of bandwidth when no video compression protocol is applied. Compared with other foveated streaming systems, ours achieves lower end-to-end latency, a higher streaming frame rate, and greater bandwidth savings.
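The idea of predicting gaze ahead of time to hide end-to-end latency can be illustrated with a minimal sketch. This is NOT the thesis's deep-learning model; it is a simple constant-velocity extrapolation over hypothetical (time, yaw, pitch) gaze samples, shown only to make the prediction-ahead principle concrete:

```python
def predict_gaze(samples, lookahead_ms):
    """Constant-velocity extrapolation of gaze direction.

    `samples` is a list of (t_ms, yaw_deg, pitch_deg) tuples, newest last.
    Illustration only: predicting the gaze `lookahead_ms` ahead lets the
    server crop the high-resolution region before the frame is displayed,
    masking the end-to-end streaming latency.
    """
    (t0, y0, p0), (t1, y1, p1) = samples[-2], samples[-1]
    dt = t1 - t0
    vy, vp = (y1 - y0) / dt, (p1 - p0) / dt
    yaw = (y1 + vy * lookahead_ms) % 360.0            # yaw wraps around the panorama
    pitch = max(-90.0, min(90.0, p1 + vp * lookahead_ms))
    return yaw, pitch

# Gaze panning right at 0.1 deg/ms; predict 50 ms ahead.
yaw, pitch = predict_gaze([(0, 10.0, 0.0), (10, 11.0, 0.0)], 50)
print(yaw, pitch)  # → 16.0 0.0
```

A learned model would replace the velocity term with a prediction from past gaze and video content, but the system-level role (cropping where the eye will be, not where it was) is the same.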
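The bandwidth-saving claim can be checked with simple arithmetic on uncompressed frame sizes. The resolutions below are assumptions for illustration (an 8K-class equirectangular panorama, a 960×960 fovea crop, an 8× downscaled background layer); the thesis does not specify them:

```python
# Hypothetical parameters, not taken from the thesis.
PANO_W, PANO_H = 7680, 3840      # full equirectangular panorama
FOVEA_W, FOVEA_H = 960, 960      # cropped high-resolution gaze region
LOW_SCALE = 8                    # downscale factor for the background layer

def raw_bytes(w, h, bytes_per_pixel=3):
    """Uncompressed size of one RGB frame."""
    return w * h * bytes_per_pixel

full = raw_bytes(PANO_W, PANO_H)
foveated = (raw_bytes(FOVEA_W, FOVEA_H)
            + raw_bytes(PANO_W // LOW_SCALE, PANO_H // LOW_SCALE))
saving = 1 - foveated / full
print(f"bandwidth saving: {saving:.1%}")  # → bandwidth saving: 95.3%
```

With these assumed parameters the foveated stream (high-resolution crop plus low-resolution panorama) carries under 5% of the raw data, consistent in spirit with the "at least 90%" figure for uncompressed streaming.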
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/83109
DOI: 10.6342/NTU202210004
Fulltext Rights: Authorized (open access worldwide)
Appears in Collections: Graduate Institute of Electronics Engineering

Files in This Item:
File: U0001-0156221026503014.pdf | Size: 3.1 MB | Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
