Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/9784
Title: | 以物件與事件為基礎之視訊內容調適架構 A Semantic Framework for Object-Based and Event-Based Video Content Adaptation |
Authors: | Wen-Huang Cheng 鄭文皇 |
Advisor: | 吳家麟 |
Keywords: | Multimedia Content Adaptation, Semantic Analysis, Video Object Detection, Video Event Detection |
Publication Year: | 2008 |
Degree: | Ph.D. |
Abstract: | In pervasive media environments, content adaptation is a key technology for supporting universal multimedia access: it transforms multimedia content to fit the usage environment. In terms of personalization, effective adaptation can benefit greatly from taking the semantics of multimedia content into account. The goal of this dissertation is therefore to provide systematic approaches that improve automatic multimedia adaptation at the semantic level. In this dissertation, a generic adaptation framework and its fundamental design principles are proposed. By exploiting domain-specific knowledge, we bridge the semantic gap between low-level computational features and high-level semantic concepts, so that the associated adaptation operations can be effectively designed to maximize the user's multimedia experience. Building on the proposed framework, our work focuses on the semantic adaptation of video content, where two alternative approaches to semantics modeling are investigated: the object-based and the event-based. In the object-based approach, a visual model is constructed for locating semantic video objects so as to improve the user's experience of browsing high-quality professional videos on devices with small displays. In the event-based approach, both visual and aural information are exploited to characterize semantic video events, serving the user's practical need to navigate hours-long home videos. The two systems can be viewed as technical realizations of the proposed adaptation framework and demonstrate the feasibility and effectiveness of automatic high-level semantics analysis. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/9784 |
Full-Text Permission: | Authorized (open access worldwide) |
Appears in Collections: | Graduate Institute of Networking and Multimedia |
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-97-1.pdf | 5.24 MB | Adobe PDF | View/Open |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.