Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101482
| Title: | 基於遞迴最小平方法之實時神經視訊編碼碼率控制 (Real-Time Rate Control for Neural Video Coding via Recursive Least Squares) |
| Author: | 陳祈安 (Chi-An Chen) |
| Advisor: | 丁建均 (Jian-Jiun Ding) |
| Keywords: | Neural Video Coding, Rate Control, Recursive Least Squares, Real-Time Video Communication, DCVC-RT |
| Year of publication: | 2026 |
| Degree: | Master's |
| Abstract: | Neural Video Coding (NVC) has recently shown the potential to surpass traditional video coding standards such as H.265/HEVC and H.266/VVC in compression efficiency. Deploying it in real-time applications such as video conferencing and live streaming nevertheless remains a major challenge. Although efficient architectures such as DCVC-RT have overcome the computational-complexity bottleneck, research on real-time Rate Control (RC) for these advanced NVC models is still scarce, and existing methods struggle to adapt to the non-linear and non-stationary behavior of neural networks.
This thesis proposes an adaptive state-space rate control system designed for the real-time neural video codec DCVC-RT. First, a global search analysis establishes that bitrate and the Quantization Parameter (QP) in DCVC-RT follow a logarithmic relationship (Logarithmic R-Q Model). Because this relationship can be linearized, a Recursive Least Squares (RLS) algorithm is introduced for online parameter estimation; compared with conventional gradient descent, RLS converges much faster and is more stable. To handle the periodic feature refresh mechanism specific to DCVC-RT, a "Dual-State RLS" strategy is further proposed: it estimates parameters separately for normal frames and refresh frames, which prevents refresh frames from acting as outliers and causing violent oscillation of the model parameters (model shattering). In addition, the system combines sliding-window budget allocation, a hierarchical quality structure, and smart output smoothing to ensure long-term traffic stability and visual quality.
Experimental results on several standard datasets, including UVG, MCL-JCV, and HEVC Class B, show that the proposed method outperforms existing baseline methods on all of them. On the UVG dataset in particular, it achieves an average BD-Rate improvement of 14.9% over the baseline algorithm and keeps the average Bitrate Error (BRE) at 1.92%. A computational complexity analysis shows that the additional time overhead is only about 1%, confirming feasibility for real-time use. This study bridges the gap between high-performance NVC and real-time transmission requirements, offering a key technical solution for the practical deployment of neural video coding. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101482 |
| DOI: | 10.6342/NTU202600297 |
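| Full-text authorization: | Not authorized |
| Electronic full-text release date: | N/A |
| Appears in collections: | Graduate Institute of Communication Engineering |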
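
The abstract's core idea, online RLS estimation of a linearized rate model with separate estimators for normal and refresh frames, can be illustrated with a minimal sketch. This is not the thesis implementation: the full text is not available here, so the model form ln(bits) = a + b * QP, the forgetting factor, the QP range 0..63, the refresh period, and all class and function names (`RLSLine`, `DualStateRateModel`, `simulate`) are assumptions chosen for illustration, and the data are synthetic.

```python
import math


class RLSLine:
    """Recursive least squares for y = a + b*x with forgetting factor lam."""

    def __init__(self, lam=0.98, p0=1e4):
        self.a, self.b = 0.0, 0.0            # current estimate of [a, b]
        self.P = [[p0, 0.0], [0.0, p0]]      # inverse-correlation matrix
        self.lam = lam

    def update(self, x, y):
        P, lam = self.P, self.lam
        # P @ phi for the regressor phi = [1, x]
        pphi = (P[0][0] + P[0][1] * x, P[1][0] + P[1][1] * x)
        denom = lam + pphi[0] + x * pphi[1]  # lam + phi^T P phi
        gain = (pphi[0] / denom, pphi[1] / denom)
        err = y - (self.a + self.b * x)      # a-priori prediction error
        self.a += gain[0] * err
        self.b += gain[1] * err
        # phi^T @ P, then P <- (P - gain * phi^T P) / lam
        phip = (P[0][0] + x * P[1][0], P[0][1] + x * P[1][1])
        self.P = [[(P[i][j] - gain[i] * phip[j]) / lam for j in range(2)]
                  for i in range(2)]


class DualStateRateModel:
    """Two independent estimators: one for normal frames, one for refresh frames."""

    def __init__(self):
        self.models = {False: RLSLine(), True: RLSLine()}

    def observe(self, qp, bits, is_refresh):
        # Assumed linearization: ln(bits) = a + b * qp
        self.models[is_refresh].update(qp, math.log(bits))

    def predict_bits(self, qp, is_refresh):
        m = self.models[is_refresh]
        return math.exp(m.a + m.b * qp)

    def qp_for_target(self, target_bits, is_refresh, qp_min=0.0, qp_max=63.0):
        m = self.models[is_refresh]
        if abs(m.b) < 1e-9:                  # slope not learned yet: fall back to mid-range
            return 0.5 * (qp_min + qp_max)
        qp = (math.log(target_bits) - m.a) / m.b
        return min(max(qp, qp_min), qp_max)  # clamp to the assumed QP range


def simulate(n_frames=400, refresh_period=8):
    """Feed noise-free synthetic frames; the 'true' parameters are made up."""
    true = {False: (8.0, 0.05), True: (9.0, 0.04)}
    model = DualStateRateModel()
    for i in range(n_frames):
        is_refresh = (i % refresh_period == 0)
        qp = 20 + (7 * i) % 21               # deterministic QP sweep in 20..40
        a, b = true[is_refresh]
        model.observe(qp, math.exp(a + b * qp), is_refresh)
    return model
```

Calling `simulate()` and then `qp_for_target(target_bits, is_refresh)` inverts the fitted line to pick a QP for a frame budget. Keeping the two estimators apart means a large refresh frame never perturbs the normal-frame parameters, which is the effect the abstract attributes to the Dual-State RLS strategy.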
Files in this item:
| File | Size | Format | |
|---|---|---|---|
| ntu-114-1.pdf (not authorized for public access) | 16.82 MB | Adobe PDF | |
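Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.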
