Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/8862
Title: | 利用音樂查詢之影像檢索系統 An Image Retrieval System Using Music as Query |
Author: | Chao-Liang Hsu 徐兆良 |
Advisor: | 鄭士康 (Shyh-Kang Jeng) |
Keywords: | Image retrieval, cross-media retrieval, metadata, search, relevance feedback |
Publication year: | 2009 |
Degree: | Master's |
Abstract: | This thesis proposes a novel image retrieval approach. Unlike traditional image retrieval approaches, which generally use keywords or example images as queries, the proposed system lets the user search for images using music as the query; that is, it is a music-image cross-media retrieval system. Music and images on the web are both accompanied by rich textual information (metadata), and this textual information is used to bridge the semantic gap between music and images. The relevance between music and images is measured from the textual information by a ranking function derived from Okapi BM25. A music-image semantic matrix is constructed from the textual information of the music and images, and PLSA (Probabilistic Latent Semantic Analysis) is applied to it to compute the HSF (hidden semantic feature) of the music and images. A neural network is trained to learn a mapping function from music audio features to HSF. In the image retrieval phase, the relevance between music and images is computed from the HSF and the textual information. Finally, user relevance feedback is used to improve the retrieval results through short-term learning (image reranking) and long-term learning (updating the music-image descriptive word map). 
To evaluate the image retrieval system, 4000 images with textual information (metadata) were collected from Flickr, and 1836 songs (2000 according to the Chinese-language abstract) were collected together with their textual information (metadata) from AMG (All Music Guide). The results show that the system achieves good performance. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/8862 |
Full-text authorization: | Authorized (open to the public worldwide) |
Appears in department: | Department of Electrical Engineering |
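
The abstract states that music-image relevance is computed from textual metadata with a ranking function derived from Okapi BM25, but it does not give that function's form. As a rough illustration only, the Python sketch below implements the standard Okapi BM25 score, treating a song's tags as the query and each image's tags as a document. The function name `bm25_scores`, the parameters `k1=1.2` and `b=0.75`, the non-negative IDF variant, and the sample tags are all assumptions made for this example, not details from the thesis.

```python
import math
from collections import Counter


def bm25_scores(query_terms, documents, k1=1.2, b=0.75):
    """Score each tokenized document against the query with standard Okapi BM25.

    Assumes `documents` is a non-empty list with at least one non-empty document.
    """
    n_docs = len(documents)
    avg_len = sum(len(doc) for doc in documents) / n_docs
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in documents for term in set(doc))

    scores = []
    for doc in documents:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_freq[term]
            if tf == 0:
                continue  # a term absent from the document contributes nothing
            # Non-negative IDF variant (1 is added inside the logarithm).
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical metadata: tags of one song (the query) and of three images.
song_tags = ["calm", "ocean", "night"]
image_tags = [
    ["ocean", "beach", "sunset", "calm"],
    ["city", "traffic", "night"],
    ["forest", "mountain", "hiking"],
]

scores = bm25_scores(song_tags, image_tags)
# Image indices sorted from most to least relevant to the song.
ranking = sorted(range(len(image_tags)), key=lambda i: scores[i], reverse=True)
print(ranking, scores)
```

In this toy input the first image shares two tags with the song, the second shares one, and the third shares none, so the third image's score is zero. The ranking function actually used in the thesis may weight or combine terms differently.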
Files in this document:
File | Size | Format | |
---|---|---|---|
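ntu-98-1.pdf | 1.16 MB | Adobe PDF | View/Open |
Unless their copyright terms are specifically indicated, all documents in the system are protected by copyright, with all rights reserved.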