Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/8596
Title: | 多維度音樂情緒辨識及音樂檢索 Dimensional Music Emotion Recognition for Content Retrieval |
Author: | Yi-Hsuan Yang 楊奕軒 |
Advisor: | 陳宏銘 (Homer Chen) |
Keywords: | music retrieval, emotion recognition, emotion classification, music information retrieval, music emotion recognition, content-based analysis |
Publication Year: | 2010 |
Degree: | Doctoral |
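Abstract: | Music has played an important role throughout human history, and even more so in today's digital age, when very large amounts of digital music content are accessible anytime and anywhere. Under these circumstances, quickly and effectively finding the songs one wants to listen to is an important problem. Because almost all music is created to convey some emotion, retrieving and organizing music by emotion is an effective approach. The aim of this dissertation is therefore to build a system for music emotion recognition and music retrieval.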
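Traditionally, emotion-based music retrieval is achieved by classifying music into several emotion categories. A major drawback of this approach is that no single emotion taxonomy can be chosen that satisfies users' needs. We therefore propose a new concept: defining music emotion on a two-dimensional emotion plane, proposed by psychologists and spanned by valence and arousal, which eliminates the ambiguity and granularity problems encountered when classifying music. More specifically, we use regression analysis to estimate the valence and arousal of each song, so that the song can be represented by its emotion as a point in a two-dimensional space. Because this two-dimensional plane provides a simple and convenient user interface, many novel emotion-based methods for music retrieval and organization can be built easily, which greatly benefits mobile devices such as mobile phones and MP3 players.
Building on this emotion recognition framework, we further propose a ranking-based method for emotion annotation and system training, which greatly reduces the difficulty of collecting emotion annotations and improves training accuracy. We also study several personalization and probability-estimation methods that reduce the effect of the subjectivity of human emotion perception on the recognition system, making the system more practical. In addition, we study several mid-level music features, including features extracted from lyrics, chords, and genre, to improve recognition accuracy. All of the proposed methods have been shown to be effective through careful experiments.
Music plays an important role in human history, even more so in the digital age. Never before has such a large collection of music been created and accessed daily by people. Because almost all music is created to convey emotion, organizing and retrieving music by emotion is a reasonable way to access music information. A typical approach to music emotion recognition (MER) categorizes emotions into a number of classes and applies machine learning techniques to train a classifier. This approach, however, faces a granularity issue: the number of emotion classes is too small in comparison with the richness of emotion perceived by humans. In this dissertation, we propose to view emotions from a dimensional perspective and define emotions as points in a 2D plane in terms of valence (how positive or negative) and arousal (how exciting or calming), the two emotion dimensions found to be most fundamental by psychologists. In this approach, MER becomes the prediction of the valence and arousal values of a song, which correspond to a point in the emotion plane. This way, the granularity and ambiguity issues associated with emotion classes no longer exist, since no categorical class is needed. Moreover, because the 2D plane provides a simple basis for a user interface, novel emotion-based music organization, browsing, and retrieval can be easily created for mobile devices that have a small display area.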
Based on this computational model, we further develop a ranking-based approach to reduce the effort of ground truth collection and investigate several methods that personalize the MER system and mitigate the personal differences in emotion perception. A novel computational framework that regards the emotion perception of a music piece as a probability distribution is developed. In addition, mid-level features such as lyrics, chords, and genre are leveraged to improve MER. The effectiveness of the proposed approaches is validated through extensive qualitative and quantitative studies. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/8596 |
Full-text license: | Authorized (open to the public worldwide) |
Appears in Collections: | Graduate Institute of Communication Engineering |
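
The abstract frames MER as regression: each song's valence and arousal are predicted from audio features, and the resulting point in the 2D emotion plane is then used for retrieval. The following is only a minimal sketch of that idea, not the dissertation's implementation. It assumes a plain linear least-squares regressor, and the two toy features (normalized tempo and brightness), the song names, and all numbers are hypothetical values chosen for illustration.

```python
from math import hypot


def fit_linear(X, y, ridge=1e-6):
    """Fit least-squares weights (last entry is the bias) via normal equations.

    A tiny ridge term is added to the diagonal for numerical stability.
    This is an illustrative regressor, not the one used in the dissertation.
    """
    rows = [list(x) + [1.0] for x in X]  # append a constant bias feature
    d = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(d)]
    # Gaussian elimination with partial pivoting.
    for c in range(d):
        p = max(range(c, d), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        b[c], b[p] = b[p], b[c]
        for r in range(c + 1, d):
            f = A[r][c] / A[c][c]
            for k in range(c, d):
                A[r][k] -= f * A[c][k]
            b[r] -= f * b[c]
    # Back substitution.
    w = [0.0] * d
    for i in reversed(range(d)):
        w[i] = (b[i] - sum(A[i][k] * w[k] for k in range(i + 1, d))) / A[i][i]
    return w


def predict(w, x):
    """Apply fitted weights to one feature vector."""
    return sum(wi * xi for wi, xi in zip(w, list(x) + [1.0]))


def nearest_songs(points, query, k=3):
    """Return the k song names closest to a (valence, arousal) query point."""
    ranked = sorted(points, key=lambda name: hypot(points[name][0] - query[0],
                                                   points[name][1] - query[1]))
    return ranked[:k]


# Hypothetical training set: (tempo, brightness) features with annotated
# valence and arousal values in [-1, 1].
train_X = [(0.9, 0.8), (0.2, 0.7), (0.1, 0.2), (0.8, 0.1)]
train_valence = [0.6, 0.4, -0.6, -0.8]
train_arousal = [0.8, -0.6, -0.8, 0.6]

# One regressor per emotion dimension.
w_valence = fit_linear(train_X, train_valence)
w_arousal = fit_linear(train_X, train_arousal)

# Hypothetical music library mapped onto the valence-arousal plane.
library = {"song_a": (0.85, 0.9), "song_b": (0.15, 0.25), "song_c": (0.3, 0.8)}
points = {name: (predict(w_valence, x), predict(w_arousal, x))
          for name, x in library.items()}

# A query such as a tap in the positive, high-arousal corner of the plane.
print(nearest_songs(points, (0.7, 0.7), k=1))
```

Training one regressor per dimension keeps valence and arousal independent, and retrieval reduces to a nearest-neighbor search around whatever point the user selects in the plane.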
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-99-1.pdf | 7.78 MB | Adobe PDF | View/Open |
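Unless their copyright terms are specifically indicated otherwise, all items in this system are protected by copyright, with all rights reserved.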