Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/79798

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 張智星(Jyh-Shing Jang) | |
| dc.contributor.author | Meng-Hsuan Wu | en |
| dc.contributor.author | 吳孟軒 | zh_TW |
| dc.date.accessioned | 2022-11-23T09:11:45Z | - |
| dc.date.available | 2021-08-20 | |
| dc.date.available | 2022-11-23T09:11:45Z | - |
| dc.date.copyright | 2021-08-20 | |
| dc.date.issued | 2021 | |
| dc.date.submitted | 2021-08-16 | |
| dc.identifier.citation | [1] Gino Brunner, Andres Konrad, Yuyi Wang, and Roger Wattenhofer. MIDI-VAE: Modeling dynamics and instrumentation of music with applications to style transfer. In Emilia Gómez, Xiao Hu, Eric Humphrey, and Emmanouil Benetos, editors, Proceedings of the 19th International Society for Music Information Retrieval Conference, ISMIR 2018, Paris, France, September 23–27, 2018, pages 747–754, 2018. [2] Hao-Wen Dong, Wen-Yi Hsiao, Li-Chia Yang, and Yi-Hsuan Yang. MuseGAN: Multi-track sequential generative adversarial networks for symbolic music generation and accompaniment. In Sheila A. McIlraith and Kilian Q. Weinberger, editors, Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), the 30th Innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2–7, 2018, pages 34–41. AAAI Press, 2018. [3] Cheng-Zhi Anna Huang, Ashish Vaswani, Jakob Uszkoreit, Ian Simon, Curtis Hawthorne, Noam Shazeer, Andrew M. Dai, Matthew D. Hoffman, Monica Dinculescu, and Douglas Eck. Music Transformer: Generating music with long-term structure. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6–9, 2019. OpenReview.net, 2019. [4] Yu-Siang Huang and Yi-Hsuan Yang. Pop Music Transformer: Beat-based modeling and generation of expressive pop piano compositions. In Chang Wen Chen, Rita Cucchiara, Xian-Sheng Hua, Guo-Jun Qi, Elisa Ricci, Zhengyou Zhang, and Roger Zimmermann, editors, MM ’20: The 28th ACM International Conference on Multimedia, Virtual Event / Seattle, WA, USA, October 12–16, 2020, pages 1180–1188. ACM, 2020. [5] Aäron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew W. Senior, and Koray Kavukcuoglu. WaveNet: A generative model for raw audio. In The 9th ISCA Speech Synthesis Workshop, Sunnyvale, CA, USA, 13–15 September 2016, page 125. ISCA, 2016. [6] Soroush Mehri, Kundan Kumar, Ishaan Gulrajani, Rithesh Kumar, Shubham Jain, Jose Sotelo, Aaron C. Courville, and Yoshua Bengio. SampleRNN: An unconditional end-to-end neural audio generation model. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24–26, 2017, Conference Track Proceedings. OpenReview.net, 2017. [7] Prafulla Dhariwal, Heewoo Jun, Christine Payne, Jong Wook Kim, Alec Radford, and Ilya Sutskever. Jukebox: A generative model for music. CoRR, abs/2005.00341, 2020. [8] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna M. Wallach, Rob Fergus, S. V. N. Vishwanathan, and Roman Garnett, editors, Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4–9, 2017, Long Beach, CA, USA, pages 5998–6008, 2017. [9] Tsung-Ping Chen and Li Su. Functional harmony recognition of symbolic music data with multi-task recurrent neural networks. In Emilia Gómez, Xiao Hu, Eric Humphrey, and Emmanouil Benetos, editors, Proceedings of the 19th International Society for Music Information Retrieval Conference, ISMIR 2018, Paris, France, September 23–27, 2018, pages 90–97, 2018. [10] Pei-Ching Li, Li Su, Yi-Hsuan Yang, and Alvin W. Y. Su. Analysis of expressive musical terms in violin using score-informed and expression-based audio features. In Meinard Müller and Frans Wiering, editors, Proceedings of the 16th International Society for Music Information Retrieval Conference, ISMIR 2015, Málaga, Spain, October 26–30, 2015, pages 809–815, 2015. [11] Diederik P. Kingma and Max Welling. Auto-encoding variational Bayes. In Yoshua Bengio and Yann LeCun, editors, 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, April 14–16, 2014, Conference Track Proceedings, 2014. [12] Kyunghyun Cho, Bart van Merrienboer, Çaglar Gülçehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Alessandro Moschitti, Bo Pang, and Walter Daelemans, editors, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, October 25–29, 2014, Doha, Qatar, A meeting of SIGDAT, a Special Interest Group of the ACL, pages 1724–1734. ACL, 2014. [13] Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron C. Courville, and Yoshua Bengio. Generative adversarial networks. CoRR, abs/1406.2661, 2014. [14] Zihang Dai, Zhilin Yang, Yiming Yang, Jaime G. Carbonell, Quoc Viet Le, and Ruslan Salakhutdinov. Transformer-XL: Attentive language models beyond a fixed-length context. In Anna Korhonen, David R. Traum, and Lluís Màrquez, editors, Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28–August 2, 2019, Volume 1: Long Papers, pages 2978–2988. Association for Computational Linguistics, 2019. [15] Angelos Katharopoulos, Apoorv Vyas, Nikolaos Pappas, and François Fleuret. Transformers are RNNs: Fast autoregressive transformers with linear attention. In Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 13–18 July 2020, Virtual Event, volume 119 of Proceedings of Machine Learning Research, pages 5156–5165. PMLR, 2020. [16] Sean Vasquez and Mike Lewis. MelNet: A generative model for audio in the frequency domain. CoRR, abs/1906.01083, 2019. [17] Wen-Yi Hsiao, Jen-Yu Liu, Yin-Cheng Yeh, and Yi-Hsuan Yang. Compound Word Transformer: Learning to compose full-song music over dynamic directed hypergraphs. In Thirty-Fifth AAAI Conference on Artificial Intelligence, AAAI 2021, Thirty-Third Conference on Innovative Applications of Artificial Intelligence, IAAI 2021, The Eleventh Symposium on Educational Advances in Artificial Intelligence, EAAI 2021, Virtual Event, February 2–9, 2021, pages 178–186. AAAI Press, 2021. [18] Ari Holtzman, Jan Buys, Li Du, Maxwell Forbes, and Yejin Choi. The curious case of neural text degeneration. In 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26–30, 2020. OpenReview.net, 2020. [19] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Köpf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. PyTorch: An imperative style, high-performance deep learning library. In Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d'Alché-Buc, Emily B. Fox, and Roman Garnett, editors, Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8–14, 2019, Vancouver, BC, Canada, pages 8024–8035, 2019. [20] Colin Raffel and Daniel P. W. Ellis. Intuitive analysis, creation and manipulation of MIDI data with pretty_midi. Proceedings of the 15th International Conference on Music Information Retrieval Late Breaking and Demo Papers, 2014. [21] Hao-Wen Dong, Wen-Yi Hsiao, and Yi-Hsuan Yang. Pypianoroll: Open source Python package for handling multitrack pianorolls. Late-Breaking Demos of the 19th International Society for Music Information Retrieval Conference, 2018. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/79798 | - |
| dc.description.abstract | In machine-learning applications, research on music resembles natural language processing, but musical properties such as chords and tonality give music processing additional dimensions to consider. Music generation, one topic within music processing, aims to have a model automatically produce entirely new music in a short time. In recent years, as machine-learning models have advanced rapidly, music generation has flourished alongside them: generated results are not only more fluent but also more complex and better structured. This thesis treats music as a language. Musical elements such as pitch and duration are defined as music events, a piece is represented as a sequence of these events, an autoregressive model is trained with a language-modeling objective, and music is generated by sampling from the predicted probabilities. We use the Beethoven Piano Sonatas with Functional Harmony dataset as training data and design music events from the chord, tonality, and phrase labels annotated by professional musicians, so the model generates music carrying rich musical information. We further design loss functions based on the chord and tonality labels to make the generated music more musical. Finally, we design a subjective listening questionnaire to compare, both perceptually and in terms of music theory, models with and without this musical information. (A minimal illustrative sketch of this pipeline follows the metadata table below.) | zh_TW |
| dc.description.provenance | Made available in DSpace on 2022-11-23T09:11:45Z (GMT). No. of bitstreams: 1 U0001-0908202117594400.pdf: 2867496 bytes, checksum: 6aaa071fd35cbce35473f10287821074 (MD5) Previous issue date: 2021 | en |
| dc.description.tableofcontents | Acknowledgements; Abstract (Chinese); Abstract; Table of Contents; List of Figures; List of Tables; Chapter 1 Introduction: 1.1 Research Overview and Motivation, 1.2 Contributions, 1.3 Chapter Overview; Chapter 2 Literature Review: 2.1 Symbolic Music Generation (2.1.1 Music from the Piano-Roll Perspective, 2.1.2 Music from the Language Perspective), 2.2 Audio Music Generation; Chapter 3 Dataset Overview: 3.1 Events, 3.2 Beats and Downbeats, 3.3 Chords, 3.4 Phrases; Chapter 4 Method: 4.1 Data Preprocessing (4.1.1 Events: 4.1.1.1 Note Information, 4.1.1.2 Tonality and Chord Information, 4.1.1.3 Phrase Information; 4.1.2 Merging Music Events: 4.1.2.1 Ignoring, 4.1.2.2 Class Events; 4.1.3 Data Augmentation and Sequence Grouping; 4.1.4 Event Summary), 4.2 Model and Architecture (4.2.1 Transformer: 4.2.1.1 Positional Encoding, 4.2.1.2 Multi-Head Attention, 4.2.1.3 Feed-Forward Network; 4.2.2 Overall Architecture), 4.3 Loss Functions (4.3.1 Tonality and Chord Loss), 4.4 Training Parameters, 4.5 Sampling and Postprocessing (4.5.1 Openings, 4.5.2 Sampling Methods, 4.5.3 Data Postprocessing), 4.6 Implementation Tools; Chapter 5 Experimental Results: 5.1 Experiment 1: Tonality and Chord Loss, 5.2 Experiment 2: Degree of Copying from the Original Pieces, 5.3 Experiment 3: Subjective Questionnaire (5.3.1 Overall Responses, 5.3.2 Years of Musical Training vs. Ratings, 5.3.3 Composition Experience vs. Ratings, 5.3.4 Performance Experience vs. Ratings, 5.3.5 Familiarity with Beethoven vs. Ratings), 5.4 Generated Music (5.4.1 Model with Note Events Only, 5.4.2 Models with Tonality or Chord Events, 5.4.3 Model with Tonality and Chord Events, 5.4.4 Model with Tonality, Chord, and Phrase Events); Chapter 6 Conclusion and Future Work: 6.1 Conclusion, 6.2 Future Work; References | |
| dc.language.iso | zh-TW | |
| dc.subject | Transformer | zh_TW |
| dc.subject | Symbolic music generation | zh_TW |
| dc.subject | Music events | zh_TW |
| dc.subject | Beethoven piano sonatas | zh_TW |
| dc.subject | Tonality | zh_TW |
| dc.subject | Chord | zh_TW |
| dc.subject | Phrase | zh_TW |
| dc.subject | Symbolic music generation | en |
| dc.subject | Transformer | en |
| dc.subject | Phrase | en |
| dc.subject | Chord | en |
| dc.subject | Tonality | en |
| dc.subject | Beethoven Piano Sonata with Function Harmony | en |
| dc.subject | Event-based representation | en |
| dc.title | Automatic Music Generation Considering Tonality, Harmony, and Phrases | zh_TW |
| dc.title | "Classical Music Transformer: An Investigation on Tonality, Harmony and Phrase" | en |
| dc.date.schoolyear | 109-2 | |
| dc.description.degree | Master | |
| dc.contributor.coadvisor | 蘇黎(Li Su) | |
| dc.contributor.oralexamcommittee | 楊奕軒(Yi-Hsuan Yang),(Hsin-Tsai Liu),(Chih-Yang Tseng) | |
| dc.subject.keyword | Symbolic music generation, Music events, Beethoven piano sonatas, Tonality, Chord, Phrase, Transformer | zh_TW |
| dc.subject.keyword | Symbolic music generation, Event-based representation, Beethoven Piano Sonata with Function Harmony, Tonality, Chord, Phrase, Transformer | en |
| dc.relation.page | 56 | |
| dc.identifier.doi | 10.6342/NTU202102219 | |
| dc.rights.note | Release authorized (open access worldwide) | |
| dc.date.accepted | 2021-08-16 | |
| dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW |
| dc.contributor.author-dept | Graduate Institute of Computer Science and Information Engineering | zh_TW |
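
To make the pipeline described in the abstract concrete, here is a minimal sketch in PyTorch (the library the thesis cites as its implementation tool [19]). Everything in it is an illustrative assumption, not the thesis code: the event names and vocabulary layout, the `top_p_sample` function, and the `lm_loss_with_harmony` formulation are all hypothetical stand-ins for the events, sampling scheme, and tonality/chord loss the thesis defines in Chapter 4.

```python
# Hypothetical sketch of an event-based music representation, nucleus
# sampling, and a harmony-weighted language-model loss. Illustrative only;
# not the thesis implementation.
import torch
import torch.nn.functional as F

# Toy event vocabulary: note events plus tonality/chord/phrase events.
EVENTS = (
    [f"note-on_{p}" for p in range(21, 109)]      # piano pitches A0..C8
    + [f"duration_{d}" for d in range(1, 33)]     # in 16th-note steps
    + [f"tonality_{k}" for k in ("C", "G", "F")]  # toy key set (assumption)
    + [f"chord_{c}" for c in ("I", "IV", "V")]    # toy functional chords
    + ["phrase-start", "phrase-end", "bar"]
)
TOKEN = {event: i for i, event in enumerate(EVENTS)}


def top_p_sample(logits: torch.Tensor, p: float = 0.9,
                 temperature: float = 1.0) -> int:
    """Nucleus (top-p) sampling over 1-D next-event logits, after [18]."""
    probs = F.softmax(logits / temperature, dim=-1)
    sorted_probs, sorted_idx = torch.sort(probs, descending=True)
    keep = torch.cumsum(sorted_probs, dim=-1) <= p
    keep[0] = True  # always keep the single most probable event
    filtered = torch.where(keep, sorted_probs, torch.zeros_like(sorted_probs))
    choice = torch.multinomial(filtered, 1)  # weights need not be normalized
    return int(sorted_idx[choice])


def lm_loss_with_harmony(logits: torch.Tensor, targets: torch.Tensor,
                         harmony_mask: torch.Tensor,
                         aux_weight: float = 0.5) -> torch.Tensor:
    """Cross-entropy over all events, plus an extra-weighted term on
    positions holding tonality/chord events. One plausible formulation
    only; the thesis's actual loss is defined in its Chapter 4."""
    ce = F.cross_entropy(logits, targets, reduction="none")  # (seq_len,)
    aux = (ce * harmony_mask).sum() / harmony_mask.sum().clamp(min=1.0)
    return ce.mean() + aux_weight * aux


if __name__ == "__main__":
    torch.manual_seed(0)
    print("vocabulary size:", len(EVENTS))
    print("sampled event:", EVENTS[top_p_sample(torch.randn(len(EVENTS)))])
```

Nucleus sampling matches reference [18] in the citation list above; weighting harmony-labeled positions in the cross-entropy is only one of several ways the chord and tonality labels could shape a loss, chosen here purely for illustration.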
Appears in Collections: Department of Computer Science and Information Engineering
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| U0001-0908202117594400.pdf | 2.8 MB | Adobe PDF | View/Open |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated by their specific license terms.
