Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/67014
Full metadata record
Metadata field | Value | Language |
---|---|---|
dc.contributor.advisor | 鄭卜壬 | |
dc.contributor.author | Siao-Yun Dai | en |
dc.contributor.author | 戴筱芸 | zh_TW |
dc.date.accessioned | 2021-06-17T01:17:17Z | - |
dc.date.available | 2020-08-24 | |
dc.date.copyright | 2017-08-24 | |
dc.date.issued | 2017 | |
dc.date.submitted | 2017-08-12 | |
dc.identifier.citation | REFERENCE
[1] C. Laurier, J. Grivolla and P. Herrera, 'Multimodal Music Mood Classification Using Audio and Lyrics,' in ICMLA, 2008.
[2] R. Neumayer and A. Rauber, 'Integration of Text and Audio Features for Genre Classification in Music Information Retrieval,' in ECIR, 2007.
[3] R. Mayer, R. Neumayer and A. Rauber, 'Rhyme and Style Features for Musical Genre Classification by Song Lyrics,' in ISMIR, 2008.
[4] R. Mayer, R. Neumayer and A. Rauber, 'Combination of audio and lyrics features for genre classification in digital audio collections,' in ACM Multimedia, 2008.
[5] R. Mayer and A. Rauber, 'Music Genre Classification by Ensembles of Audio and Lyrics Features,' in ISMIR, 2011.
[6] C. C. Pratt, Music as the Language of Emotion, Library of Congress, 1950.
[7] Y. E. Kim, E. M. Schmidt, R. Migneco, B. G. Morton, P. Richardson, J. J. Scott, J. A. Speck and D. Turnbull, 'Music Emotion Recognition: A State of the Art Review,' in ISMIR, 2010.
[8] J. A. Russell, 'A circumplex model of affect,' Journal of Personality and Social Psychology, p. 1161, 1980.
[9] X. Hu and J. S. Downie, 'Improving mood classification in music digital libraries by combining lyrics and audio,' in JCDL, 2010.
[10] G. Tzanetakis, 'MARSYAS Submissions to MIREX 2007,' 2007.
[11] Y. LeCun, B. E. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. E. Hubbard and L. D. Jackel, 'Backpropagation Applied to Handwritten Zip Code Recognition,' Neural Computation, 1989.
[12] D. M. Blei, A. Y. Ng and M. I. Jordan, 'Latent Dirichlet Allocation,' Journal of Machine Learning Research, 2003.
[13] F. Kleedorfer, P. Knees and T. Pohle, 'Oh Oh Oh Whoah! Towards Automatic Topic Detection in Song Lyrics,' in ISMIR, 2008.
[14] L. Sterckx, T. Demeester, J. Deleu, L. Mertens and C. Develder, 'Assessing Quality of Unsupervised Topics in Song Lyrics,' in ECIR, 2014.
[15] S. Dieleman and B. Schrauwen, 'End-to-end learning for music audio,' in ICASSP, 2014.
[16] H. Lee, P. T. Pham, Y. Largman and A. Y. Ng, 'Unsupervised feature learning for audio classification using convolutional deep belief networks,' in NIPS, 2009.
[17] M. Henaff, K. Jarrett, K. Kavukcuoglu and Y. LeCun, 'Unsupervised Learning of Sparse Features for Scalable Audio Classification,' in ISMIR, 2011.
[18] J. Wülfing and M. Riedmiller, 'Unsupervised Learning of Local Features for Music Classification,' in ISMIR, 2012.
[19] S. Dieleman and B. Schrauwen, 'Multiscale Approaches to Music Audio Feature Learning,' in ISMIR, 2013.
[20] A. van den Oord, S. Dieleman and B. Schrauwen, 'Deep content-based music recommendation,' in NIPS, 2013.
[21] S. Ioffe and C. Szegedy, 'Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift,' in ICML, 2015.
[22] X. Yi and J. Allan, 'A Comparative Study of Utilizing Topic Models for Information Retrieval,' in ECIR, 2009.
[23] J.-Y. Liu and Y.-H. Yang, 'Event Localization in Music Auto-tagging,' in ACM Multimedia, 2016. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/67014 | - |
dc.description.abstract | Nowadays, music has become an important part of our lives, and as cloud-based streaming services grow in popularity, people depend on it even more. As a tool for expressing emotion, music is rich in semantics. Previous work on genre and mood classification has shown that combining lyric and audio features improves results, indicating a potential relationship between audio and lyrics: lyrics directly describe a song's topic, while audio amplifies its emotions. Nevertheless, lyrics can be incomplete or missing. If we can learn topics from audio, we can infer a song's likely topics without using its lyrics. We propose an unsupervised two-stage method. First, we learn the latent topics in lyrics with a topic model. Second, we map the audio signal to a topic distribution via a convolutional neural network. We show that this framework indeed learns a semantic representation from audio and can be applied directly to song retrieval. We can search not only songs with lyrics; for songs without lyrics, e.g. classical pieces, we can also provide reasonable results. | en |
dc.description.provenance | Made available in DSpace on 2021-06-17T01:17:17Z (GMT). No. of bitstreams: 1 ntu-106-R04922064-1.pdf: 6584558 bytes, checksum: 272794705114724773cbf1f4a28a022e (MD5) Previous issue date: 2017 | en |
dc.description.tableofcontents | CONTENTS
誌謝 (Acknowledgements) ii
摘要 (Chinese Abstract) iii
Abstract iv
CONTENTS v
List of Figures vii
List of Tables ix
Chapter 1 Introduction 1
Chapter 2 Related Work 3
  2.1 Genre Classification 3
  2.2 Mood Classification 4
Chapter 3 Methodology 6
  3.1 Topic Model and Lyrics 7
    3.1.1 Topic Model: Latent Dirichlet Allocation 7
    3.1.2 Topic Modeling on Lyrics 8
  3.2 Preprocessing 9
  3.3 Audio Feature Extraction 10
  3.4 Convolutional Neural Network 11
    3.4.1 Input Layer 12
    3.4.2 Convolution and Max-pooling Layers 13
    3.4.3 Fully Connected Layers 13
    3.4.4 Output Layer 13
    3.4.5 Loss Functions 13
Chapter 4 Experiment 15
  4.1 Dataset 16
  4.2 Offline Evaluation for CNN 16
    4.2.1 Test Case Sampling 17
    4.2.2 Experiment Result 17
    4.2.3 Discovery 18
Chapter 5 Application 23
  5.1.1 Search Songs by Song 23
  5.1.2 Search Songs by Term 26
Chapter 6 Conclusion and Future Work 29
REFERENCE 30 | |
dc.language.iso | en | |
dc.title | 從音訊到主題:用卷積神經網路學習語意 | zh_TW |
dc.title | From Audio to Topics: Learning Semantics with Convolutional Neural Network | en |
dc.type | Thesis | |
dc.date.schoolyear | 105-2 | |
dc.description.degree | 碩士 (Master) | |
dc.contributor.oralexamcommittee | 邱志義,高宏宇,陳怡安 | |
dc.subject.keyword | 卷積神經網路,隱含狄利克雷分布,主題模型,音訊, | zh_TW |
dc.subject.keyword | Convolutional Neural Network,LDA,topic model,audio signal, | en |
dc.relation.page | 32 | |
dc.identifier.doi | 10.6342/NTU201702967 | |
dc.rights.note | 有償授權 (authorized with compensation) | |
dc.date.accepted | 2017-08-14 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 資訊工程學研究所 | zh_TW |
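The abstract's two-stage method (latent topics learned from lyrics, then a CNN trained to map audio to that topic distribution) can be sketched in miniature. This is a minimal sketch, assuming toy numbers and illustrative function names; the use of KL divergence as the training loss is an assumption here, not a detail confirmed by this record (the thesis's actual choices are in Section 3.4.5, Loss Functions).

```python
# Hypothetical sketch of the two-stage pipeline, not the thesis's code.
# Stage 1: a topic model (LDA) gives each song a K-dim topic distribution
# from its lyrics. Stage 2: a CNN over the audio is trained so that its
# softmax output matches that distribution, e.g. by minimizing KL divergence.
import numpy as np

def softmax(z):
    # CNN output layer: turn raw scores into a probability distribution
    e = np.exp(z - z.max())
    return e / e.sum()

def kl_divergence(p, q, eps=1e-12):
    # Loss between the LDA target p and the CNN prediction q
    p, q = np.asarray(p) + eps, np.asarray(q) + eps
    return float(np.sum(p * np.log(p / q)))

# Toy stand-ins: stage-1 output (topic distribution for one song's lyrics)
lda_target = np.array([0.7, 0.2, 0.1])
# ... and stage-2 raw CNN scores for the same song's audio
cnn_scores = np.array([2.0, 0.5, -0.5])
pred = softmax(cnn_scores)
loss = kl_divergence(lda_target, pred)  # gradient of this loss trains the CNN
```

Once trained, the CNN's output distribution is a lyric-free semantic representation of the audio, which is what makes retrieval for songs without lyrics possible.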
Appears in Collections: | 資訊工程學系 |
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-106-1.pdf (Restricted Access) | 6.43 MB | Adobe PDF | |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.