Optimization of Multiview Generation for Volumetric Video Streaming with Edge Transcoding

吳奕寶; Yi-Pao Wu

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/91088

Title:	量體視訊串流下之多視角轉碼最佳化 (Optimization of Multiview Generation for Volumetric Video Streaming with Edge Transcoding)
Author:	吳奕寶 (Yi-Pao Wu)
Advisor:	謝宏昀 (Hung-Yun Hsieh)
Keywords:	virtual reality, volumetric video, video streaming, edge rendering, competitive learning vector quantization
Publication year:	2023
Degree:	Master's
Abstract:	One of the major research topics in enabling the metaverse is the streaming of volumetric video, which entails ultra-high bandwidth consumption, ultra-low latency requirements, and significant decoding overhead. This work explores a streaming system with edge rendering, in which 2D views of the volumetric video are transcoded at the edge server according to the viewport prediction result. However, the inherent inaccuracy of viewport prediction may degrade the quality of the transcoded frames, because those frames are rendered at a viewpoint offset from the user's actual viewpoint. In the state-of-the-art edge-assisted volumetric streaming system, the positions at which multiple transcoded views are generated are selected by shifting the predicted position with a uniform step size. Because this approach does not account for the varying accuracy of viewport prediction, it can significantly degrade the quality of the pre-rendered views. This work incorporates virtual view synthesis techniques into the streaming system and, based on thorough simulation results, establishes a quality model that captures the relation between frame quality and position deviation. With this quality model as the objective, an optimization framework built upon the optimal quantization problem is formulated to select the positions for generating multiple transcoded views so as to optimize expected quality, taking empirical simulation results into account instead of relying solely on Euclidean distance.
We propose an algorithm that integrates gradient-free optimization with competitive learning vector quantization to maximize expected quality. The algorithm dynamically determines the best positions at which to transcode views, considering the probability distribution of the user's position and the empirical simulation results. Our evaluations indicate that the proposed algorithm outperforms the baseline method of the state-of-the-art transcoded volumetric streaming system, with an improvement ratio ranging from 55% to 83% on a segment-by-segment basis; that is, it improves frame quality for 55% to 83% of the streaming time.
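The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it names: competitive learning vector quantization driven by a quality model, with a simple gradient-free accept-if-better check. Everything specific here is an assumption made for illustration and is not taken from the thesis: the exponential `quality` function (a stand-in for the empirical quality model), the Gaussian samples of user position, the baseline that shifts views along a single axis, and all function names and parameters.

```python
import numpy as np


def quality(deviation):
    """Hypothetical quality model (assumption): quality falls off with position deviation."""
    return 40.0 * np.exp(-2.0 * np.asarray(deviation, dtype=float))


def expected_quality(centers, samples):
    """Mean quality when each sampled user position is served by its best view."""
    d = np.linalg.norm(samples[:, None, :] - centers[None, :, :], axis=2)
    return float(quality(d).max(axis=1).mean())


def uniform_baseline(predicted, step, k):
    """k view positions shifted from the predicted position along x with a uniform step (assumed layout)."""
    centers = np.tile(np.asarray(predicted, dtype=float), (k, 1))
    centers[:, 0] += (np.arange(k) - (k - 1) / 2.0) * step
    return centers


def clvq_select(samples, init_centers, epochs=5, lr0=0.1, seed=0):
    """Competitive-learning updates, keeping the best view positions seen so far."""
    rng = np.random.default_rng(seed)
    centers = np.array(init_centers, dtype=float)
    best, best_q = centers.copy(), expected_quality(centers, samples)
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(len(samples)):
            x = samples[i]
            # Competitive step: the winner is the view giving this sample the best quality.
            winner = int(np.argmax(quality(np.linalg.norm(centers - x, axis=1))))
            # Move only the winner toward the sample, with a decaying learning rate.
            centers[winner] += lr0 / (1.0 + 0.01 * t) * (x - centers[winner])
            t += 1
        # Gradient-free check: keep the candidate only if the objective itself improves.
        q = expected_quality(centers, samples)
        if q > best_q:
            best, best_q = centers.copy(), q
    return best, best_q


def demo(seed=0):
    """Compare the uniform-step baseline with the tuned positions on synthetic samples."""
    rng = np.random.default_rng(seed)
    predicted = np.array([0.0, 0.0])
    # Assumed prediction-error distribution: more uncertainty along y than along x.
    samples = predicted + rng.normal(size=(500, 2)) * np.array([0.1, 0.4])
    init = uniform_baseline(predicted, step=0.2, k=3)
    base_q = expected_quality(init, samples)
    _, tuned_q = clvq_select(samples, init)
    return base_q, tuned_q
```

Calling `demo()` returns the expected quality of the uniform-step baseline and of the tuned positions on the same samples. Because the sketch only keeps a candidate when the evaluated objective improves, the tuned value is never lower than the baseline it starts from. With a quality function that decreases monotonically in distance, as assumed here, the winner is simply the nearest view; an empirical quality model that is not a pure function of Euclidean distance would change which view wins.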
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/91088
DOI:	10.6342/NTU202303932
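Full-text authorization:	Authorized (open to the public worldwide)
Electronic full-text release date:	2026-01-01
Appears in department:	Graduate Institute of Communication Engineering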

Files in this item:

File	Size	Format
ntu-111-2.pdf	11.33 MB	Adobe PDF

Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise specified.
