NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/10702

Full metadata record
dc.contributor.advisor: 丁建均 (Jian-Jiun Ding)
dc.contributor.author: Che-Ming Hu (en)
dc.contributor.author: 胡哲銘 (zh_TW)
dc.date.accessioned: 2021-05-20T21:51:19Z
dc.date.available: 2013-08-13
dc.date.available: 2021-05-20T21:51:19Z
dc.date.copyright: 2010-08-13
dc.date.issued: 2010
dc.date.submitted: 2010-07-29
dc.identifier.citation: REFERENCE
A. Query-by-Humming System
[1] S. Pauws, "CubyHum: A Fully Operational Query by Humming System," in Proc. of ISMIR, pp. 187-196, 2002.
[2] A. Ghias, J. Logan, D. Chamberlain, and B. C. Smith, "Query by Humming: Musical Information Retrieval in an Audio Database," in Proc. of the ACM International Multimedia Conference, pp. 231-236, San Francisco, California, Nov. 1995.
[3] N. Kosugi, Y. Nishihara, S. Kon'ya, M. Yamamuro, and K. Kushima, "Music Retrieval by Humming," in Proc. of IEEE PACRIM'99, Aug. 1999.
B. Onset Detection
[4] A. Klapuri, "Sound Onset Detection by Applying Psychoacoustic Knowledge," in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing, 1999.
[5] A. M. Barbancho, A. Jurado, and I. Barbancho, "Identification of Rhythm and Sound in Polyphonic Piano Recordings," in Proc. of Forum Acusticum, Sevilla, 2002.
[6] C. Duxbury, M. Sandler, and M. Davies, "A Hybrid Approach to Musical Note Onset Detection," in Proc. of the 5th International Conference on Digital Audio Effects, pp. 33-38, 2002.
[7] J. Foote, "Automatic Audio Segmentation Using a Measure of Audio Novelty," in Proc. of IEEE International Conference on Multimedia and Expo, vol. 1, pp. 452-455, 1999.
[8] P. Masri and A. Bateman, "Improved Modelling of Attack Transients in Music Analysis-Resynthesis," in Proc. of the International Computer Music Conference (ICMC 96), Hong Kong, Aug. 1996.
[9] J. P. Bello, G. Monti, and M. Sandler, "Techniques for Automatic Music Transcription," in Proc. of the International Symposium on Music Information Retrieval, Plymouth, Massachusetts, 2000.
C. Pitch Estimation
[10] A. de Cheveigné and H. Kawahara, "YIN, a Fundamental Frequency Estimator for Speech and Music," J. Acoust. Soc. Am., vol. 111, issue 4, pp. 1917-1930, April 2002.
[11] A. Klapuri, "Multipitch Analysis of Polyphonic Music and Speech Signals Using an Auditory Model," IEEE Trans. Audio, Speech, and Language Processing, vol. 16, pp. 255-266, February 2008.
[12] D. Hermes, "Measurement of Pitch by Subharmonic Summation," J. Acoust. Soc. Am., vol. 83, no. 1, pp. 257-273, 1988.
[13] D. J. Levitin, "Absolute Memory for Musical Pitch: Evidence from the Production of Learned Melodies," Perception & Psychophysics, vol. 58, pp. 927-935, 1994.
[14] E. S. Tsau, N. Cho, and C.-C. J. Kuo, "Fundamental Frequency Estimation for Music Signals with Modified Hilbert-Huang Transform," in Proc. of ICME'09, June 2009.
[15] E. Pollastri, "Melody-Retrieval Based on Pitch-Tracking and String-Matching Methods," in Proc. of the 12th Colloquium on Musical Informatics, 1999.
[16] H. Hajimolahoseini, M. R. Taban, and H. R. Abutalebi, "Automatic Transcription of Music Signal Using Harmonic Elimination Method," Telecommunications, 2008.
[17] L. R. Rabiner and B. Juang, Fundamentals of Speech Recognition, Prentice-Hall, 1993.
[18] N. E. Huang et al., "The Empirical Mode Decomposition and the Hilbert Spectrum for Nonlinear and Non-Stationary Time Series Analysis," 1998.
D. Melody Matching
[19] Z. K. Peynirçioglu, A. I. Tekcan, J. L. Wagner, T. L. Baxter, and S. D. Shaffer, "Name or Hum That Tune: Feeling of Knowing for Music," Memory & Cognition, vol. 26, issue 6, pp. 1131-1137, 1998.
[20] B. Pardo, W. P. Birmingham, and J. Shifrin, "Name That Tune: A Pilot Study in Finding a Melody from a Sung Query," Journal of the American Society for Information Science and Technology, vol. 55, pp. 283-300, 2000.
[21] E. Ukkonen, "Approximate String Matching with q-grams and Maximal Matches," Theoretical Computer Science, vol. 92, issue 1, pp. 191-211, 1992.
[22] G. Myers, "A Fast Bit-Vector Algorithm for Approximate String Matching Based on Dynamic Programming," Journal of the ACM, vol. 46, issue 4, pp. 395-415, 1999.
[23] J.-S. R. Jang and H.-R. Lee, "Hierarchical Filtering Method for Content-Based Music Retrieval via Acoustic Input," in Proc. of the 9th ACM International Conference on Multimedia, pp. 401-410, ACM Press, 2001.
[24] M. Mongeau and D. Sankoff, "Comparison of Musical Sequences," Computers and the Humanities, vol. 24, pp. 161-175, 1990.
[25] P. H. Sellers, "The Theory and Computation of Evolutionary Distances: Pattern Recognition," Journal of Algorithms, issue 1, pp. 359-373, 1980.
[26] R. B. Dannenberg, "Understanding Search Performance in Query by Humming Systems," in Proc. of ISMIR, 2004.
[27] R. A. Wagner and M. J. Fischer, "The String-to-String Correction Problem," Journal of the Association for Computing Machinery, vol. 21, issue 1, pp. 168-173, 1974.
[28] R. Baeza-Yates and G. Navarro, "Faster Approximate String Matching," Algorithmica, vol. 23, issue 2, pp. 127-158, 1999.
[29] R. J. McNab, L. A. Smith, I. H. Witten, and C. L. Henderson, "Tune Retrieval in the Multimedia Library," Multimedia Tools and Applications, vol. 10, pp. 113-132, 2000.
[30] T. Nishimura, H. Hashiguchi, J. Takita, J. X. Zhang, M. Goto, and R. Oka, "Music Signal Spotting Retrieval by a Humming Query Using Start Frame Feature Dependent Continuous Dynamic Programming," in Proc. of ISMIR, pp. 211-218, Bloomington, Indiana, 2001.
[31] W. Chang and E. Lawler, "Sublinear Approximate String Matching and Biological Applications," Algorithmica, vol. 12, issue 4/5, pp. 327-344, 1994.
[32] W. L. Dowling, "Scale and Contour: Two Components of a Theory of Memory for Melodies," Psychological Review, vol. 85, issue 4, pp. 341-354, 1978.
[33] W. J. Dowling and D. L. Harwood, Music Cognition, New York: Academic Press, 1986.
[34] W. Schloss, "On the Automatic Transcription of Percussive Music: From Acoustic Signal to High Level Analysis," Ph.D. thesis, Department of Music, Report No. STAN-M-27, Stanford University, CCRMA, 1985.
[35] Y. Zhu and M. Kankanhalli, "Music Scale Modeling for Melody Matching," in Proc. of the 11th ACM International Conference on Multimedia, pp. 359-362, Nov. 2003.
[36] Y. Zhu and D. Shasha, "Query by Humming: A Time Series Database Approach," in Proc. of SIGMOD, 2003.
[37] Y. W. Zhu and M. Kankanhalli, "A Robust Music Retrieval Method for Query-by-Humming," in Proc. of the International Conference on Information Technology: Research and Technology, New Jersey, USA, Aug. 10-13, 2003.
E. Others
[38] E. D. Scheirer, "Tempo and Beat Analysis of Musical Signals," J. Acoust. Soc. Am., vol. 103, issue 1, pp. 588-601, January 1998.
[39] J. C. Brown and J. C. Vaughn, "Pitch Center of Stringed Instrument Vibrato Tones," J. Acoust. Soc. Am., vol. 100, pp. 1728-1735, 1996.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/10702
dc.description.abstract: In recent years, music retrieval techniques have been studied in depth. Traditionally, we retrieve music with text-based cues such as song titles or singer names. But what if we have forgotten those textual cues? We therefore introduce a novel concept: retrieving a song by its melodic content. Query by humming (QBH) is an interactive concept in which people search for music with a computer over the Internet, and over a large music database the search must return results quickly.
Roughly speaking, we extract the pitches of a hummed melody and compare them against the music database for maximal similarity. Songs in the database with high similarity are listed in descending order of similarity. This thesis presents algorithms for onset detection, pitch detection, and melody matching. However, melodies produced by human singing are always imperfect, so many difficulties remain to be solved. Each person has a different singing style, which yields a wide variety of waveform patterns (differences in pitch height, pitch accuracy, how note endings are hummed, singing manner, and so on), and these factors also make onset detection difficult.
Therefore, in this thesis we propose three main methods to improve the QBH system: one improves onset detection, and the other two improve melody matching. In addition, we adopt our own approach to pitch detection.
Finally, future research should further improve QBH performance, making it more adaptive and reliable across different singing styles. For melody matching, we also require faster search times and lower complexity, especially over a large music database. (zh_TW)
dc.description.abstract: Music retrieval techniques have been investigated intensively in recent years. Typically, we retrieve music by the names of singers or songs. But how do we search for a song when we have forgotten those names? Here we introduce a novel concept for music retrieval: searching by any melodic passage of a song. Query by humming (QBH) is an interactive concept in which people search for music with a computer over the Internet; the results are returned quickly and in ranked order by comparing the sung input against a large database of known songs.
Generally speaking, we extract a series of pitches from the humming input of a single individual and compare these pitches with the pitch intervals in the musical database. Melodies (themes) in the database that are similar to the sung input are retrieved and listed in order of similarity score. This thesis presents algorithms for note onset detection (event detection), pitch detection, pitch quantization, melody encoding, and melody matching (similarity or pattern matching).
However, human reproduction of melodies is always imperfect, so there are many difficulties to overcome. Every individual has a different singing style, which produces a wide variety of sung-input patterns and makes note onset detection more difficult. Moreover, the accuracy of onset detection also influences the performance of melody matching.
Therefore, in this thesis we propose three main methods to improve the QBH system: one improves onset detection, and the other two improve melody matching. In addition, we use our own method to extract the fundamental frequency.
Accordingly, future research should work to improve event detection, making it more adaptive and reliable in various singing situations. Furthermore, research on measuring similarity between database melodies and the sung theme should be pursued further to reduce computation time and complexity, especially as music databases continue to grow. (en)
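The matching step described in the abstract, comparing a sung pitch sequence against database melodies and ranking them by similarity score, is classically done with dynamic-programming edit distance (the approach surveyed in Chapter 5). The following is a minimal illustrative sketch, not the author's modified algorithm; the function names, costs, and toy melodies are assumptions made here for illustration.

```python
# Illustrative dynamic-programming melody matcher: computes an edit
# distance between two pitch-interval sequences (semitone differences
# between consecutive notes), so a transposed humming still matches.

def pitch_intervals(pitches):
    """Convert a sequence of MIDI pitch numbers to successive intervals."""
    return [b - a for a, b in zip(pitches, pitches[1:])]

def edit_distance(query, target, sub_cost=1, indel_cost=1):
    """Classic Wagner-Fischer edit distance between interval sequences."""
    m, n = len(query), len(target)
    # dp[i][j] = minimal cost of aligning query[:i] with target[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dp[i][0] = i * indel_cost
    for j in range(1, n + 1):
        dp[0][j] = j * indel_cost
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if query[i - 1] == target[j - 1] else sub_cost
            dp[i][j] = min(dp[i - 1][j - 1] + sub,       # match / substitute
                           dp[i - 1][j] + indel_cost,    # delete from query
                           dp[i][j - 1] + indel_cost)    # insert into query
    return dp[m][n]

def rank_database(sung_pitches, database):
    """Return (title, distance) pairs sorted from best to worst match."""
    q = pitch_intervals(sung_pitches)
    scored = [(title, edit_distance(q, pitch_intervals(p)))
              for title, p in database.items()]
    return sorted(scored, key=lambda t: t[1])

if __name__ == "__main__":
    db = {"tune_a": [60, 62, 64, 65, 67],   # hypothetical melodies
          "tune_b": [60, 60, 67, 67, 69]}
    # A humming transposed up 2 semitones still matches tune_a exactly.
    print(rank_database([62, 64, 66, 67, 69], db))
```

Working on intervals rather than absolute pitches is what makes the match transposition-invariant, which matters because few people hum in the original key.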
dc.description.provenance: Made available in DSpace on 2021-05-20T21:51:19Z (GMT). No. of bitstreams: 1. ntu-99-R97942093-1.pdf: 1879202 bytes, checksum: 748c79a938fc2da2f3544713eb09fe5a (MD5). Previous issue date: 2010 (en)
dc.description.tableofcontents:
Certification by Oral Examination Committee #
Acknowledgements v
Chinese Abstract vii
ABSTRACT ix
CONTENTS xi
LIST OF FIGURES xv
LIST OF TABLES xix
Chapter 1 Introduction 1
1.1 Background 1
1.2 Primary Achievement of This Thesis 2
1.3 Chapter Organization 4
1.4 Abbreviations 6
Chapter 2 Basic Musical Characteristics 7
Chapter 3 Onset Detection 11
3.1 Magnitude Method 13
3.2 Short-Term Energy Method 16
3.2.1 Implementation of Short-Term Energy Method 17
3.2.2 Discussion of Short-Term Energy Method 19
3.3 Surf Method 20
3.3.1 Implementation of Surf Method 21
3.3.2 Discussion of Surf Method 23
3.4 High Frequency Content (HFC) method 24
3.4.1 Implementation of HFC Method 25
3.4.2 Discussion of HFC Method 26
3.5 Pitch method 27
3.6 Discussion between These Onset Detections 29
Chapter 4 Pitch Estimation 33
4.1 Autocorrelation Function for Pitch Tracking 33
4.2 Modified Hilbert-Huang Transform 34
4.2.1 The Overview of HHT 35
4.2.2 Music Pitch Analysis with Modified HHT 37
4.3 Sub-Harmonic Summation 38
4.4 Harmonic Elimination Method 39
4.4.1 Elimination of Harmonics 40
4.4.2 Extended Kalman Filter (EKF) 41
Chapter 5 Melody Matching 43
5.1 Dynamic Programming Algorithm 43
5.1.1 Implementation of Dynamic Programming Algorithm 44
5.1.2 Complexity of Dynamic Programming 46
5.2 Hidden Markov Model 47
5.2.1 Implementation of Hidden Markov Models 49
5.2.2 Complexity of Hidden Markov Model 54
Chapter 6 Representations for Musical Feature 57
6.1 Music Scale Modeling for Melody Matching 58
6.2 Pitch Interval and IOI for Melody Matching 61
6.3 Construction of Music Database 63
Chapter 7 Proposed Algorithm 65
7.1 Proposed Onset Detection 69
7.2 Proposed Pitch Estimation 75
7.3 Proposed Melody Matching 78
7.3.1 Modified Dynamic Programming 78
7.3.2 Modified Hidden Markov Model 84
Chapter 8 Experimental Result 87
8.1 Performance Standard 87
8.2 Comparison of Onset Detection Performance 88
8.3 Comparison of Melody Matching Performance 91
8.4 Comparison of Whole QBH System Performance 92
Chapter 9 Conclusion and Future Work 97
9.1 Conclusions 97
9.2 Future Work 98
REFERENCE 101
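The outline above begins Chapter 4 with the autocorrelation function for pitch tracking (Section 4.1). As a hedged illustration of that basic idea only, not the thesis's own fundamental-frequency extractor, a frame-level autocorrelation pitch estimator can be sketched as follows; the frame length, sample rate, and search range are assumptions chosen here.

```python
# Illustrative autocorrelation pitch estimator: the lag that maximizes
# the autocorrelation of an audio frame, searched within a plausible
# pitch range, is taken as the period; f0 = sample_rate / lag.
import math

def estimate_f0(frame, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate the fundamental frequency (Hz) of one audio frame."""
    n = len(frame)
    lag_min = int(sample_rate / fmax)                 # shortest period searched
    lag_max = min(int(sample_rate / fmin), n - 1)     # longest period searched
    best_lag, best_corr = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Raw autocorrelation at this lag.
        corr = sum(frame[i] * frame[i + lag] for i in range(n - lag))
        if corr > best_corr:
            best_corr, best_lag = corr, lag
    return sample_rate / best_lag

if __name__ == "__main__":
    sr = 8000
    # A pure 220 Hz sine; the estimate should land near 220 Hz.
    tone = [math.sin(2 * math.pi * 220.0 * t / sr) for t in range(1024)]
    print(estimate_f0(tone, sr))
```

Because the lag search is quantized to whole samples, the estimate is only accurate to the nearest representable period; practical systems (including the methods surveyed in Chapter 4) refine this with interpolation or more robust estimators such as YIN [10].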
dc.language.iso: en
dc.title: 歌聲檢索系統:改良式發端識別以及修正式旋律比對 (zh_TW)
dc.title: Querying by Humming System: Improved Onset Detection and Modified Melody Matching (en)
dc.type: Thesis
dc.date.schoolyear: 98-2
dc.description.degree: Master
dc.contributor.oralexamcommittee: 王鵬華, 葉敏宏, 許文良
dc.subject.keyword: 哼唱詢問, 發端識別, 音頻偵測, 旋律比對 (zh_TW)
dc.subject.keyword: Querying by humming, Onset detection, Pitch detection, Melody matching (en)
dc.relation.page: 105
dc.rights.note: Authorized (open access worldwide)
dc.date.accepted: 2010-07-30
dc.contributor.author-college: 電機資訊學院 (zh_TW)
dc.contributor.author-dept: 電信工程學研究所 (zh_TW)
Appears in Collections: 電信工程學研究所

Files in This Item:
File: ntu-99-1.pdf  Size: 1.84 MB  Format: Adobe PDF