Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/7430

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 陳宏銘 | |
| dc.contributor.author | Yi-Wei Chen | en |
| dc.contributor.author | 陳怡瑋 | zh_TW |
| dc.date.accessioned | 2021-05-19T17:43:27Z | - |
| dc.date.available | 2023-08-20 | |
| dc.date.available | 2021-05-19T17:43:27Z | - |
| dc.date.copyright | 2018-08-20 | |
| dc.date.issued | 2018 | |
| dc.date.submitted | 2018-08-18 | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/7430 | - |
| dc.description.abstract | 對於建立自動音樂情緒辨識系統而言,收集音樂給人的情緒感受標記是必須的。迄今,大部分的音樂情緒資料集都是以西洋歌曲為主。若音樂情緒辨識系統是以西洋曲風的資料集建立的,此系統可能沒辦法適用於非西洋曲風的歌曲,因這兩個曲風受文化背景的影響,在音樂特徵上以及標記者的情緒感受上皆有不同之處。即使這樣的問題已在跨文化以及跨資料集的研究中被發現,但很少有研究探討如何將用收集到的曲風資料集訓練的模型重新訓練以適應於我們感興趣的曲風上。在本篇論文中,我們提出以非監督式對抗式域適應之方法來解決這個問題。此方法應用了類神經網路之模型使兩曲風學到的表徵無法被區分。又情緒感受本身包含了許多面向,因此我們考慮了與音色、音高、以及節奏性相關之三種輸入特徵來評估模型之成效。結果顯示以西洋流行歌曲訓練的模型透過我們提出的方法可大幅改善用中文歌曲預測情緒正負向的準確率。 | zh_TW |
| dc.description.abstract | Annotation of the perceived emotion of music pieces is needed to build an automatic music emotion recognition system. To date, the majority of music emotion datasets cover Western pop songs. A recognizer trained on such datasets may not work well for non-Western pop songs, because both the acoustic characteristics and listeners' emotion perception differ with cultural background. Although this problem has been identified in cross-cultural and cross-dataset studies, little work has examined how to adapt a model pre-trained on a source music genre to a target genre of interest. In this thesis, we address the problem with an unsupervised adversarial domain adaptation method, which trains neural network models so that the target music becomes indistinguishable from the source music in a learned feature representation space. Because emotion perception is multifaceted, three types of input features, related to timbre, pitch, and rhythm, are considered in the performance evaluation. The results show that the proposed method substantially improves the valence prediction of Chinese pop songs by a model trained on Western pop songs. | en |
| dc.description.provenance | Made available in DSpace on 2021-05-19T17:43:27Z (GMT). No. of bitstreams: 1 ntu-107-R02942096-1.pdf: 3172971 bytes, checksum: 5509d6d653cd64fcb9466baf3edcbbe0 (MD5) Previous issue date: 2018 | en |
| dc.description.tableofcontents | Thesis Committee Approval Certificate i; Acknowledgements ii; Chinese Abstract iii; ABSTRACT iv; CONTENTS v; LIST OF FIGURES vi; LIST OF TABLES vii; Chapter 1 INTRODUCTION 1; Chapter 2 RELATED WORK 4; 2.1 Music Emotion Recognition 4; 2.2 Domain Adaptation 5; Chapter 3 METHODOLOGY 7; 3.1 Pre-training 8; 3.2 Adversarial Discriminative Domain Adaptation 9; Chapter 4 NETWORK ARCHITECTURE 13; 4.1 Log-mel-spectrogram Encoder 14; 4.2 Pitch Encoder 15; 4.3 Autocorrelation-Based Tempogram Encoder 16; 4.4 Regressor and Discriminator 16; 4.5 Fusion 17; Chapter 5 EXPERIMENTS SETTING 18; 5.1 Datasets 18; 5.2 Training Parameters 19; 5.3 Baseline 19; Chapter 6 RESULTS AND DISCUSSION 21; 6.1 Analysis of Training ADDA 21; 6.2 Within-Dataset Experiment 23; 6.3 Cross-Dataset Experiment 26; Chapter 7 CONCLUSION 32; REFERENCES 33 | |
| dc.language.iso | en | |
| dc.title | 基於對抗式適應之跨文化音樂情緒辨識 | zh_TW |
| dc.title | Cross-Cultural Music Emotion Recognition by Adversarial Discriminative Domain Adaptation | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 106-2 | |
| dc.description.degree | Master | |
| dc.contributor.oralexamcommittee | 楊奕軒,蘇黎,蔡銘峰,王釧茹 | |
| dc.subject.keyword | 跨文化,音樂情緒辨識,音樂資訊檢索,域適應,對抗式判別之域適應, | zh_TW |
| dc.subject.keyword | Cross-cultural, music emotion recognition, music information retrieval, domain adaptation, adversarial discriminative domain adaptation | en |
| dc.relation.page | 37 | |
| dc.identifier.doi | 10.6342/NTU201802248 | |
| dc.rights.note | Authorized (open access worldwide) | |
| dc.date.accepted | 2018-08-18 | |
| dc.contributor.author-college | 電機資訊學院 | zh_TW |
| dc.contributor.author-dept | 電信工程學研究所 | zh_TW |
| dc.date.embargo-lift | 2023-08-20 | - |
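The adversarial discriminative adaptation described in the abstract can be illustrated with a minimal numerical sketch. This is not the thesis implementation: the encoder here is a plain affine map, the discriminator is logistic regression, and the "features" are synthetic 2-D Gaussians with an artificial domain shift; all names (`Xs`, `Xt`, `W`, `b`, learning rate, step counts) are hypothetical choices for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for extracted audio features: "source" vectors
# (e.g. the Western-pop training domain) and "target" vectors (e.g. the
# Chinese-pop domain), offset by a constant domain shift.
d = 2
Xs = rng.normal(size=(500, d))          # source-domain features
Xt = rng.normal(size=(500, d)) + 3.0    # target-domain features (shifted)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

# Target encoder: an affine map initialised from the (here: identity)
# pretrained source encoder, as in ADDA. Discriminator: logistic
# regression labelling encoded source samples 1 and target samples 0.
W, b = np.eye(d), np.zeros(d)
w, c = 0.1 * rng.normal(size=d), 0.0
lr, batch = 0.05, 64

for step in range(2000):
    xs = Xs[rng.integers(0, len(Xs), batch)]
    xt = Xt[rng.integers(0, len(Xt), batch)]
    zs, zt = xs, xt @ W.T + b           # source encoder stays fixed

    # Discriminator step: minimise BCE for labels source=1, target=0.
    ps, pt = sigmoid(zs @ w + c), sigmoid(zt @ w + c)
    w -= lr * ((ps - 1.0) @ zs + pt @ zt) / batch
    c -= lr * ((ps - 1.0).sum() + pt.sum()) / batch

    # Encoder step: update W, b so the discriminator calls the target
    # "source" (inverted label), i.e. minimise -log D(z_t).
    pt = sigmoid(zt @ w + c)
    dzt = np.outer(pt - 1.0, w)         # d(-log D)/dz_t, per sample
    W -= lr * (dzt.T @ xt) / batch
    b -= lr * dzt.mean(axis=0)

# After adaptation, encoded target features should sit much closer to
# the source distribution than the raw target features did.
shift_before = np.linalg.norm(Xt.mean(0) - Xs.mean(0))
shift_after = np.linalg.norm((Xt @ W.T + b).mean(0) - Xs.mean(0))
print(shift_before, shift_after)
```

The alternation mirrors the GAN-style training loop: the discriminator learns to separate encoded source from encoded target batches, while the target encoder is updated with the inverted label so it learns to fool the discriminator, pulling the two feature distributions together.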
| Appears in Collections: | 電信工程學研究所 | |
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-107-1.pdf | 3.1 MB | Adobe PDF | View/Open |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
