Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97370
Title: EXPERT-VQA: Ensemble Expert Prediction with Adaptive Frame Selection for Short-Form Video Quality Assessment
(Chinese title: 基於自適應採樣和混合專家模型的短影音品質分析系統)
Author: Chih-Hao Huang (黃致豪)
Advisor: Shih-Wei Liao (廖世偉)
Keywords: Video Quality Assessment, User-Generated Short-form Video, Context-Aware Frame Selection, Mixture of Experts, Adaptive Quality Calibration, Video Understanding
Publication year: 2025
Degree: Master's
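Abstract (translated from the Chinese original): User-generated content video quality assessment (UGC-VQA) evaluates the quality of videos that users shoot and upload to social platforms. As social platforms have grown in popularity in recent years, the volume of user-generated video has risen sharply and the problem has become increasingly important. Because user-shot videos often exhibit unstable picture quality, varied compression settings, and diverse creative effects, accurately quantifying and predicting the viewing experience is key to maintaining content standards and viewing quality on these platforms.
However, with the rise of short-form video, effects such as rapid editing, special filters, and jump cuts have become more common. Short-form video differs from traditional long-form video in its means of expression, so the viewing experience differs as well, and traditional UGC-VQA methods face new challenges when applied to short-form video. For example, a fixed frame sampling strategy often misses key transitions, and creative filters can be mistaken for quality distortion, so existing models tend to underestimate the quality of short-form video. Based on these observations, we propose EXPERT-VQA to address these challenges. First, we adopt an adaptive frame sampling strategy (APT-FS) to capture the most representative segments. Next, we fuse multiple pre-trained expert models and add a lightweight gating network that dynamically determines the weight of each expert's contribution. Finally, a quality score calibration module systematically corrects the gap between short-form video viewers' expectations of quality and the predictions of existing models. Experimental results confirm that the framework outperforms existing methods on both correlation and error metrics, and that it is particularly effective on short-form videos with frequent transitions or strong stylization. Our main contributions are: (1) an adaptive frame sampling strategy that compensates for the shortcomings of fixed-rate frame sampling, (2) multi-expert fusion that combines multiple facets of quality assessment, and (3) a calibration module that addresses the typical negative bias of existing models on short-form video. Together, these methods bring quality assessment for short-form video closer to the real viewing experience and yield scores that better match it.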
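The contrast between fixed-rate sampling and change-driven frame selection can be illustrated with a small sketch. This is not the thesis's APT-FS method, whose details are not given on this page; it is a hypothetical stand-in that scores each frame by its mean absolute pixel difference from the previous frame and keeps the frames with the largest change. The function names, the scoring rule, and the toy clip are all assumptions made for illustration.

```python
from typing import List, Sequence


def frame_change_scores(frames: Sequence[Sequence[float]]) -> List[float]:
    """Mean absolute pixel difference between each frame and its predecessor.

    The first frame has no predecessor, so its score is 0.0.
    (Assumed scoring rule, not the one used in the thesis.)
    """
    scores = [0.0]
    for prev, cur in zip(frames, frames[1:]):
        diff = sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        scores.append(diff)
    return scores


def select_frames(frames: Sequence[Sequence[float]], budget: int) -> List[int]:
    """Pick up to `budget` frame indices, favouring frames with the largest change.

    Hypothetical heuristic: always keep frame 0, then add frames in order of
    decreasing change score. Ties are broken by the earlier index.
    """
    if budget <= 0 or not frames:
        return []
    scores = frame_change_scores(frames)
    ranked = sorted(range(1, len(frames)), key=lambda i: (-scores[i], i))
    chosen = [0] + ranked[: budget - 1]
    return sorted(chosen)


# Toy clip: six 4-pixel grayscale frames with hard cuts at index 2 and index 5.
clip = [
    [10, 10, 10, 10],
    [10, 10, 10, 11],
    [200, 200, 200, 200],
    [200, 200, 201, 200],
    [200, 200, 201, 200],
    [50, 50, 50, 50],
]
selected = select_frames(clip, budget=3)
print(selected)  # [0, 2, 5]
```

With the same budget of three frames, a fixed stride of 2 would pick indices 0, 2, and 4 and miss the cut at index 5, which is the kind of failure the abstract attributes to fixed sampling.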
User-generated content video quality assessment (UGC-VQA) tackles the task of evaluating videos that users record and share on social media. As online platforms expand dramatically, the number and variety of these videos have increased significantly. This growth makes it critical to accurately measure viewer experience, even when faced with challenges such as inconsistent video quality, different compression techniques, and a range of creative visual effects. Traditional UGC-VQA methods, originally developed for longer videos, often fall short on short-form content. Such videos typically feature rapid edits, abrupt transitions, and distinctive stylistic filters, which can lead to a consistent underestimation of the quality perceived by viewers.
In response, this thesis introduces EXPERT-VQA, a novel framework specifically designed for short-form video quality assessment. Our approach tackles the problem through three key innovations. First, we propose the Adaptive and Perceptual Transition Frame Selection (APT-FS) method to dynamically identify and select frames that capture the most significant visual changes. This method overcomes the limitations of fixed-rate sampling. Second, we integrate multiple pre-trained VQA models, each excelling in different quality aspects, by employing a lightweight learnable gating network that fuses their predictions while preserving their individual strengths. Finally, we employ a calibration module to correct for the systematic bias observed in existing models. This correction ensures that the final quality score aligns more closely with actual viewer perceptions.
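The fusion and calibration steps described above can be sketched as follows, under stated assumptions. The abstract does not specify the gating network's architecture or the form of the calibration, so this sketch assumes a single linear layer followed by a softmax for the gate, and an affine correction clipped to a 1 to 5 score range for the calibration. All weights, feature values, and the score range are hypothetical placeholders, not values from the thesis.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gate_weights(features: Sequence[float],
                 gate_matrix: Sequence[Sequence[float]]) -> List[float]:
    """One linear layer plus softmax: returns one weight per expert.

    `gate_matrix` has one row per expert and one column per feature.
    (Assumed gate design; in practice the matrix would be learned.)
    """
    logits = [sum(w * f for w, f in zip(row, features)) for row in gate_matrix]
    return softmax(logits)


def fuse(expert_scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the experts' quality predictions."""
    return sum(w * s for w, s in zip(weights, expert_scores))


def calibrate(score: float, scale: float, offset: float,
              lo: float = 1.0, hi: float = 5.0) -> float:
    """Affine bias correction, clipped to the assumed score range [lo, hi]."""
    return min(hi, max(lo, scale * score + offset))


# Hypothetical inputs: three pre-trained experts and a two-dimensional
# content feature vector for one video.
expert_scores = [3.0, 3.6, 3.3]
features = [0.8, 0.4]
gate_matrix = [
    [0.5, -0.2],
    [1.0, 0.3],
    [-0.5, 0.1],
]

weights = gate_weights(features, gate_matrix)
fused = fuse(expert_scores, weights)
# A positive offset counteracts a systematic underestimation (assumed value).
final_score = calibrate(fused, scale=1.0, offset=0.3)
print(weights, fused, final_score)
```

Because the softmax weights are positive and sum to one, the fused score always lies between the lowest and highest expert prediction; the calibration step is what allows the final score to move outside that range when all experts share the same bias.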
Experimental evaluation on the YouTube SFV+HDR dataset demonstrates that EXPERT-VQA achieves superior performance, yielding higher correlation with human opinions and lower prediction errors compared to current state-of-the-art methods. Ablation studies further confirm that the APT-FS module, multi-expert fusion, and calibration process each contribute significantly to the overall improvements.
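The abstract reports "correlation with human opinions" and "prediction errors" without naming the metrics. Video quality work commonly uses the Spearman rank correlation (SROCC), the Pearson linear correlation (PLCC), and the root-mean-square error (RMSE), so the sketch below assumes those three; the scores in the example are made up for illustration and are not results from the thesis.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson linear correlation coefficient (PLCC)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation (SROCC): Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rmse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Root-mean-square error between predictions and targets."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))


# Made-up example: predictions that rank videos correctly but sit
# 0.5 points below the human ratings.
human = [2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 2.5, 3.5, 4.5]

print(spearman(predicted, human))  # perfect rank agreement
print(rmse(predicted, human))      # 0.5

# Adding back the constant offset removes the error without changing the ranking.
corrected = [p + 0.5 for p in predicted]
print(rmse(corrected, human))      # 0.0
```

The example shows why both kinds of metric matter here: a constant negative bias leaves the correlations untouched but shows up directly in RMSE, which is the gap a calibration step is meant to close.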
In conclusion, this work provides a basis for assessing short-form video quality. The results show that adaptive frame selection, expert fusion, and calibration help reduce the difference between algorithmic predictions and human ratings. EXPERT-VQA may be used as a flexible and effective framework for video quality evaluation on social media. This work may also help guide future research in video quality assessment.
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/97370
DOI: 10.6342/NTU202500939
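Full-text authorization: Not authorized
Electronic full-text release date: N/A
Appears in department: Graduate Institute of Networking and Multimedia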

Files in this item:
File: ntu-113-2.pdf (not authorized for public access)
Size: 9.93 MB
Format: Adobe PDF