NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/80912

Full metadata record (DC field: value, language):
dc.contributor.advisor: 黃乾綱 (Chien-Kang Huang)
dc.contributor.author: Chia-Hsin Lee (en)
dc.contributor.author: 李嘉欣 (zh_TW)
dc.date.accessioned: 2022-11-24T03:21:38Z
dc.date.available: 2021-11-08
dc.date.available: 2022-11-24T03:21:38Z
dc.date.copyright: 2021-11-08
dc.date.issued: 2021
dc.date.submitted: 2021-09-24
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/80912
dc.description.abstract: Chord recognition is an essential skill in both music composition and music performance. It has traditionally been done by manual annotation, which not only costs considerable labor and time but also requires substantial expert musical knowledge. This study therefore proposes an automatic chord recognition model (Automatic Chord Recognition System) that works for both small-vocabulary and large-vocabulary chord sets, raising recognition scores on the small vocabulary while also improving performance on the large vocabulary, including enlarging the set of recognizable chord types and raising the evaluation scores.

Deep neural network architectures have become the mainstream of current automatic chord recognition research, and models can be built to suit different needs. Using three public datasets for training and testing, we design a feature extractor based on a convolutional neural network, combined with a decoding model built from a bi-directional long short-term memory (BLSTM) network and conditional random fields (CRF), and run experiments on both small- and large-vocabulary chords. In the small-vocabulary experiments, the average WCSR (Weighted Chord Symbol Recall) reaches 84.3%, up to 8.8% higher than other deep-learning-based models, showing that our feature extractor learns more precise features and, paired with the decoding model, effectively raises the recognition rate. In the large-vocabulary experiments, we expand the evaluation metrics from one to six and, while enlarging the set of recognizable chord types, maintain the original small-vocabulary recognition rate, obtaining a WCSR of 71.5% on the sevenths evaluation labels and 66.1% on the tetrads evaluation labels, about 1-2% higher than other models.

To improve large-vocabulary recognition further, we add two methods. First, targeting the scarce seventh chords, we add new training data to address the uneven chord distribution in existing datasets, raising the sevenths WCSR to 72.1%, a 0.6% improvement. Second, we set aside the flat classification scheme used by most current research and return to the precise original definition of a chord, designing a threshold-based decision rule over the key notes that determine chord quality and using it to evaluate complex extended chords; this yields a sevenths WCSR of 74.5% (a total gain of 3%) and a tetrads WCSR of 68.4% (a total gain of 2.3%), while also recognizing inverted chords and tripling the number of recognizable chords. Together these two sets of experiments validate the generality of the model and improve both the recognition rate and the number of recognizable chords for the large vocabulary. (zh_TW)
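The WCSR figures reported in the abstract are duration-weighted scores: each annotated segment contributes to the recall in proportion to how long it lasts, so a missed two-second chord costs twice as much as a missed one-second chord. A minimal Python sketch of that weighting (the `wcsr` helper is hypothetical; it uses exact label matching for illustration, whereas the thesis evaluates with vocabulary-aware comparison schemes such as sevenths and tetrads):

```python
def wcsr(segments):
    """Weighted Chord Symbol Recall over annotated segments.

    segments: iterable of (duration_seconds, reference_label,
    estimated_label) tuples. Returns the fraction of total annotated
    time whose estimated chord matches the reference chord. Exact
    string matching stands in for the vocabulary-aware comparators
    used in real evaluations.
    """
    segments = list(segments)  # allow generators; we iterate twice
    total = sum(dur for dur, _, _ in segments)
    matched = sum(dur for dur, ref, est in segments if ref == est)
    return matched / total if total > 0 else 0.0

# 3 s of the 4 s of annotated time is estimated correctly -> 0.75
score = wcsr([(2.0, "C:maj", "C:maj"),
              (1.0, "G:7", "G:maj"),
              (1.0, "A:min", "A:min")])
print(score)
```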
dc.description.provenance: Made available in DSpace on 2022-11-24T03:21:38Z (GMT). No. of bitstreams: 1. U0001-1709202105281000.pdf: 8855595 bytes, checksum: 9c3c4a52ec3b57c0f704e592fab5b0f1 (MD5). Previous issue date: 2021 (en)
dc.description.tableofcontents:
Oral Examination Committee Approval Certificate II
Acknowledgements III
Abstract (Chinese) IV
Abstract (English) VI
Table of Contents VIII
List of Figures XI
List of Tables XII
Chapter 1 Introduction 1
1.1 Background and Motivation 1
1.2 Research Objectives 2
1.3 Contributions 2
1.4 Thesis Organization 3
Chapter 2 Related Work 4
2.1 Chord Fundamentals 4
2.1.1 Chord Construction Rules 4
2.1.2 Standard Chord Notation 7
2.2 Neural Networks 9
2.2.1 Convolutional Neural Network 9
2.2.2 Deep Residual Network 11
2.2.3 Recurrent Neural Network 11
2.3 Related Research on Automatic Chord Recognition 13
2.4 Large-Vocabulary and Small-Vocabulary Chord Recognition 15
2.5 Distribution and Limitations of Public Datasets 16
2.5.1 Uneven Chord Distribution in Datasets 16
2.5.2 Ambiguity and Subjectivity in Annotations 17
Chapter 3 Methodology 18
3.1 Problem Definition and Objectives 18
3.2 Research Workflow 19
3.3 System Architecture 19
3.4 Preprocessing 20
3.4.1 Song Alignment 20
3.4.2 Chromagram Conversion 20
3.5 Model Training 23
3.5.1 Feature Extractor Model 23
3.5.2 Sequence Decoding Model 27
3.6 Improving Large-Vocabulary Chord Recognition 31
Chapter 4 Experimental Results and Discussion 34
4.1 Experimental Procedure 34
4.2 Experimental Environment 34
4.3 Datasets 34
4.3.1 Isophonics, McGill Billboard, and RWC Public Datasets 34
4.3.2 Synthetic Seventh-Chord Dataset 35
4.3.3 Ground-Truth Format 36
4.4 Evaluation Metrics 37
4.5 Experimental Results 38
4.5.1 Small-Vocabulary Results 38
4.5.2 Large-Vocabulary Results 40
4.5.3 Improving Large-Vocabulary Recognition 41
4.6 Error Analysis 44
Chapter 5 Conclusion and Future Work 55
5.1 Conclusion 55
5.2 Future Work 56
References 57
dc.language.iso: zh-TW
dc.subject: 條件隨機域 (conditional random fields) (zh_TW)
dc.subject: 自動和弦辨識 (automatic chord recognition) (zh_TW)
dc.subject: 深度學習 (deep learning) (zh_TW)
dc.subject: 卷積神經網路 (convolutional neural network) (zh_TW)
dc.subject: 雙向長短期記憶神經網路 (bi-directional long short-term memory) (zh_TW)
dc.subject: Bi-directional Long Short Term Memory (BLSTM) (en)
dc.subject: Conditional Random Fields (CRF) (en)
dc.subject: Automatic Chord Recognition (ACR) (en)
dc.subject: Convolutional Neural Network (CNN) (en)
dc.title: 自動和弦辨識針對大詞彙集改善之研究 (zh_TW)
dc.title: A Study for Improving Large-vocabulary Automatic Chord Recognition (en)
dc.date.schoolyear: 109-2
dc.description.degree: Master's (碩士)
dc.contributor.oralexamcommittee: 張恆華 (Hsin-Tsai Liu), 陳柏琳 (Chih-Yang Tseng), 王正豪
dc.subject.keyword: 自動和弦辨識, 深度學習, 卷積神經網路, 雙向長短期記憶神經網路, 條件隨機域 (zh_TW)
dc.subject.keyword: Automatic Chord Recognition (ACR), Convolutional Neural Network (CNN), Bi-directional Long Short Term Memory (BLSTM), Conditional Random Fields (CRF) (en)
dc.relation.page: 61
dc.identifier.doi: 10.6342/NTU202103229
dc.rights.note: Authorized for release (access restricted to campus)
dc.date.accepted: 2021-09-27
dc.contributor.author-college: 工學院 (College of Engineering) (zh_TW)
dc.contributor.author-dept: 工程科學及海洋工程學研究所 (Graduate Institute of Engineering Science and Ocean Engineering) (zh_TW)
Appears in collections: 工程科學及海洋工程學系 (Department of Engineering Science and Ocean Engineering)

Files in this item:
File: U0001-1709202105281000.pdf (8.65 MB, Adobe PDF)
Access is restricted to NTU campus IP addresses (use the VPN service for off-campus access).


All items in the system are protected by copyright, with all rights reserved, unless otherwise indicated.
