NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/19357

Full metadata record (DC field: value, with the field language in parentheses)
dc.contributor.advisor: 丁建均 (Jian-Jiun Ding)
dc.contributor.author: Chiao-Wei Lin (en)
dc.contributor.author: 林巧薇 (zh_TW)
dc.date.accessioned: 2021-06-08T01:55:13Z
dc.date.copyright: 2016-08-02
dc.date.issued: 2016
dc.date.submitted: 2016-07-12
dc.identifier.citation:
A. Query by Humming System
[1] A. Ghias, J. Logan, D. Chamberlin et al., “Query by humming: musical information retrieval in an audio database,” Proceedings of the third ACM international conference on Multimedia. ACM, pp. 231-236, 1995.
[2] N. Kosugi, Y. Nishihara, S. Kon et al., “Music retrieval by humming-using similarity retrieval over high dimensional feature vector space,” Communications, Computers and Signal Processing, 1999 IEEE Pacific Rim Conference on. IEEE, pp. 404-407, 1999.
[3] S. Pauws, “CubyHum: a fully operational query by humming system,” ISMIR, pp. 187-196, 2002.
B. Onset Detection
[4] S. Hainsworth, and M. Macleod, “Onset detection in musical audio signals,” Proc. Int. Computer Music Conference, pp. 163-6, 2003.
[5] J. P. Bello, L. Daudet, S. Abdallah et al., “A tutorial on onset detection in music signals,” Speech and Audio Processing, IEEE Transactions on, vol. 13, no. 5, pp. 1035-1047, 2005.
[6] J.-J. Ding, C.-J. Tseng, C.-M. Hu et al., “Improved onset detection algorithm based on fractional power envelope match filter,” Signal Processing Conference, 2011 19th European. IEEE, pp. 709-713, 2011.
[7] J. P. Bello, and M. Sandler, “Phase-based note onset detection for music signals,” Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP'03). 2003 IEEE International Conference on. IEEE, vol. 5, pp. V-441-4, 2003.
[8] S. Abdallah, and M. D. Plumbley, “Unsupervised onset detection: a probabilistic approach using ICA and a hidden Markov classifier,” Cambridge Music Processing Colloquium, 2003.
[9] A. Klapuri, “Sound onset detection by applying psychoacoustic knowledge,” Acoustics, Speech, and Signal Processing, 1999. Proceedings, 1999 IEEE International Conference on. IEEE, vol. 6, pp. 3089-3092, 1999.
C. Pitch Estimation
[10] M. R. Schroeder, “Period Histogram and Product Spectrum: New Methods for Fundamental‐Frequency Measurement,” The Journal of the Acoustical Society of America, vol. 43, no. 4, pp. 829-834, 1968.
[11] L. Rabiner, M. J. Cheng, A. E. Rosenberg et al., “A comparative performance study of several pitch detection algorithms,” Acoustics, Speech and Signal Processing, IEEE Transactions on, vol. 24, no. 5, pp. 399-418, 1976.
[12] D. J. Hermes, “Measurement of pitch by subharmonic summation,” The journal of the acoustical society of America, vol. 83, no. 1, pp. 257-264, 1988.
[13] S. Kadambe, and G. F. Boudreaux-Bartels, “Application of the wavelet transform for pitch detection of speech signals,” IEEE Transactions on Information Theory, vol. 38, no. 2, pp. 917-924, 1992.
[14] E. Pollastri, “Melody-retrieval based on pitch-tracking and string-matching methods,” Proc. Colloquium on Musical Informatics, Gorizia, 1998.
[15] E. Tsau, N. Cho, and C.-C. J. Kuo, “Fundamental frequency estimation for music signals with modified Hilbert-Huang transform (HHT),” Multimedia and Expo, 2009. ICME 2009. IEEE International Conference on. IEEE, pp. 338-341, 2009.
[16] M. J. Ross, H. L. Shaffer, A. Cohen et al., “Average magnitude difference function pitch extractor,” Acoustics, Speech and Signal Processing, IEEE Transactions on, vol. 22, no. 5, pp. 353-362, 1974.
[17] N. E. Huang, Z. Shen, S. R. Long et al., “The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis,” Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, vol. 454, no. 1971, pp. 903-995, 1998.
[18] X.-D. Mei, J. Pan, and S.-h. Sun, “Efficient algorithms for speech pitch estimation,” Intelligent Multimedia, Video and Speech Processing, 2001. Proceedings of 2001 International Symposium on. IEEE, pp. 421-424, 2001.
[19] A. M. Noll, “Cepstrum pitch determination,” The journal of the acoustical society of America, vol. 41, no. 2, pp. 293-309, 1967.
D. Melody Matching
[20] J. Shifrin, and W. Birmingham, “Effectiveness of HMM-based retrieval on large databases,” Ann Arbor, vol. 1001, pp. 48109-2110, 2003.
[21] Y. Kim, and C. H. Park, “Query by Humming by Using Scaled Dynamic Time Warping,” Signal-Image Technology & Internet-Based Systems (SITIS), 2013 International Conference on, pp. 1-5, IEEE, 2013.
[22] G. P. Nam, and K. R. Park, “Fast query-by-singing/humming system that combines linear scaling and quantized dynamic time warping algorithm,” International Journal of Distributed Sensor Networks, vol. 2015, 2015.
[23] J.-S. R. Jang, and M.-Y. Gao, “A query-by-singing system based on dynamic programming,” Proceedings of international workshop on intelligent system resolutions (8th Bellman Continuum), Hsinchu, pp. 85-89, 2000.
[24] S. Doraisamy, and S. Rüger, “Robust polyphonic music retrieval with n-grams,” Journal of Intelligent Information Systems, vol. 21, no. 1, pp. 53-70, 2003.
[25] J.-S. R. Jang, H.-R. Lee, and M.-Y. Kao, “Content-based music retrieval using linear scaling and branch-and-bound tree search,” ICME, vol. 1, pp. 289-292, 2001.
[26] J. Yang, J. Liu, and W. Zhang, “A fast query by humming system based on notes,” INTERSPEECH, pp. 2898-2901, 2010.
[27] W.-H. Tsai, and Y.-M. Tu, “An Efficient Query-by-Singing/Humming System Based on Fast Fourier Transforms of Note Sequences,” Multimedia and Expo (ICME), 2012 IEEE International Conference on. IEEE, pp. 521-525, 2012.
[28] R. Bellman, “Dynamic programming and Lagrange multipliers,” Proceedings of the National Academy of Sciences of the United States of America, vol. 42, no. 10, pp. 767, 1956.
[29] L. R. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition,” Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286, 1989.
[30] G. P. Nam, T. T. T. Luong, H. H. Nam et al., “Intelligent query by humming system based on score level fusion of multiple classifiers,” EURASIP Journal on Advances in Signal Processing, vol. 2011, no. 1, pp. 1-11, 2011.
E. Others
[31] Jyh-Shing Roger Jang, “MIR-QBSH Corpus,” MIR Lab, CS Dept., Tsing Hua Univ., Taiwan. Available at the “MIR-QBSH Corpus” link at http://www.cs.nthu.edu.tw/~jang.
[32] E.D. Scheirer, “Tempo and Beat Analysis of Musical Signals,” The Journal of the Acoustic Society of America, vol. 103, Issue 1, pp. 588-601, January 1998.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/19357
dc.description.abstract: In recent years, voice signal systems have been applied very widely, covering the analysis of audio signals, feature extraction, and further synthesis and compression. Applied to a query by humming (QBH) system, a song can be searched by its melody and tempo information instead of by known textual information such as the song title, singer, or lyrics. Since today's music databases are very large, such a search system should finish a query in a short time, so both search efficiency and effectiveness must be considered.
A QBH system mainly consists of three parts: the input, the database, and the system algorithm. The input is what the system receives, which in this system is usually a plain singing voice; the database holds the comparable song data and provides the candidate tracks; the system algorithm first extracts features from the input and converts them into a comparable format, such as pitch and note length, and then uses a matching algorithm to compare the input sequence with the database.
The QBH system uses the matching algorithm to compare the input signal with every song in the database and gives a list of possible songs ranked by similarity score. However, most people cannot sing the melody exactly as in the original song, and every person's singing style is different, so a good system should handle these situations as well as it can.
Generally speaking, matching methods fall roughly into two types: note based or frame based. The former represents both the database and the input signal in units of notes, and its advantage is higher efficiency; the latter works in units of frames and is generally more effective in matching. In this thesis we use a note-based matching system, propose an improved onset detection method, and also improve the melody matching method to enhance the QBH system. In addition, we also use our own proposed method for pitch estimation.
For future research, we hope to further improve the efficiency and effectiveness of query by humming, especially since the number of songs in a real-world database is very large and therefore demands high computational speed. For onset detection, although our method already achieves good results, there is still room for improvement, and we hope future research can reach higher accuracy. (zh_TW)
dc.description.abstract: In recent years, voice signal systems have been used in a wide range of applications. The techniques include voice signal analysis, feature extraction, voice synthesis, and compression. Applying these techniques to a query by humming (QBH) system, we do not need the traditional concrete descriptions such as the song name, singer, or lyrics. Instead, we can use melody and tempo information to search for a song. Today's music databases are large and the search result should be returned in a short time, so system efficiency is as important as effectiveness.
A QBH system includes three parts: the input data, the music database, and the matching algorithm. The input data of a QBH system is usually humming or singing from a person. The database is a collection of songs provided for the search. The system first extracts features of the input signal, such as the pitch and length of notes, and then uses the matching algorithm to compare them with the songs in the database.
The QBH system compares the input signal with the database and lists a ranking of possible songs by matching score. However, people usually cannot sing exactly as in the reference song, and singing style also differs from person to person. A good QBH system should be able to deal with all the problems that can occur in amateur singing.
Generally speaking, melody matching has at least two types: the note-based and the frame-based method. The advantage of a note-based system is its efficiency, while a frame-based system is more effective. In this paper, we use the note-based method. We propose an advanced onset detection method and an improved melody matching system to improve the QBH system. Besides, we use our own pitch estimation method to estimate the fundamental frequency.
In our experiments, we show that our proposed onset detection performs better than the other methods. However, future work should make an effort to further improve the system's efficiency and effectiveness, since the database in the real world is huge. (en)
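As a concrete illustration of the note-based front end described in the abstract above (extracting the pitch and length of notes from the hummed input), the following Python sketch converts a frame-level pitch track and a list of detected onsets into (pitch, duration) note events. It is only a generic illustration, not the thesis's implementation: the 10 ms hop size, the median-based note pitch, and the MIDI semitone conversion are assumptions made for the example.

```python
import numpy as np

def frames_to_notes(pitch_hz, onset_frames, hop_s=0.010):
    """Turn a frame-level pitch track (Hz, 0 = unvoiced) plus detected onset
    frame indices into a list of (midi_pitch, duration_seconds) note events.
    Each note spans from one onset to the next; its pitch is the median of
    the voiced frames in that span, converted to a MIDI-style semitone."""
    pitch_hz = np.asarray(pitch_hz, dtype=float)
    boundaries = list(onset_frames) + [len(pitch_hz)]
    notes = []
    for start, end in zip(boundaries[:-1], boundaries[1:]):
        segment = pitch_hz[start:end]
        voiced = segment[segment > 0]              # drop unvoiced frames
        if voiced.size == 0:
            continue                               # silent segment: no note
        f0 = np.median(voiced)                     # robust note pitch (Hz)
        midi = 69 + 12 * np.log2(f0 / 440.0)       # Hz -> semitones (A4 = 69)
        notes.append((midi, (end - start) * hop_s))
    return notes

# Synthetic example: three steady notes hummed at A4, C5 and D5.
if __name__ == "__main__":
    track = np.concatenate([np.full(40, 440.0),
                            np.full(60, 523.25),
                            np.full(50, 587.33)])
    print(frames_to_notes(track, onset_frames=[0, 40, 100]))
```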
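The abstract also states that the matching algorithm compares the extracted note sequence with every song in the database and ranks candidates by score, and Chapter 3 of the thesis reviews dynamic programming [28] among other matchers. The sketch below shows a generic dynamic-programming alignment over pitch-interval sequences (intervals make the comparison key-invariant); the gap and substitution costs and the toy database are illustrative assumptions, not the exact scheme proposed in the thesis.

```python
import numpy as np

def pitch_intervals(midi_pitches):
    """Successive pitch differences, so a query sung in a different key
    can still match the reference melody."""
    return np.diff(np.asarray(midi_pitches, dtype=float))

def dp_distance(query, reference, gap_cost=2.0):
    """Classic dynamic-programming alignment cost between two interval
    sequences; a lower cost means a better match."""
    n, m = len(query), len(reference)
    D = np.zeros((n + 1, m + 1))
    D[1:, 0] = np.arange(1, n + 1) * gap_cost    # cost of skipping query notes
    D[0, 1:] = np.arange(1, m + 1) * gap_cost    # cost of skipping reference notes
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = abs(query[i - 1] - reference[j - 1])   # substitution cost
            D[i, j] = min(D[i - 1, j - 1] + sub,
                          D[i - 1, j] + gap_cost,
                          D[i, j - 1] + gap_cost)
    return D[n, m]

def rank_songs(query_notes, database):
    """Sort song ids by ascending alignment cost (best match first)."""
    query_iv = pitch_intervals([pitch for pitch, _ in query_notes])
    costs = {sid: dp_distance(query_iv, pitch_intervals(melody))
             for sid, melody in database.items()}
    return sorted(costs, key=costs.get)

# Toy database of MIDI pitch sequences (hypothetical ids and melodies).
if __name__ == "__main__":
    db = {"song_a": [60, 62, 64, 65, 67], "song_b": [67, 65, 64, 62, 60]}
    hummed = [(61, 0.4), (63, 0.4), (65, 0.5), (66, 0.4), (68, 0.6)]  # sung a semitone high
    print(rank_songs(hummed, db))   # "song_a" should come first
```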
dc.description.provenance: Made available in DSpace on 2021-06-08T01:55:13Z (GMT). No. of bitstreams: 1
ntu-105-R03942079-1.pdf: 3307714 bytes, checksum: 2018ae5484e06616b028d010d402ba69 (MD5)
Previous issue date: 2016 (en)
dc.description.tableofcontents:
中文摘要 i
ABSTRACT ii
CONTENTS iv
LIST OF FIGURES vii
LIST OF TABLES x
Chapter 1 Introduction 1
1.1 Background 1
1.2 Musical Characteristics and Terms 3
1.3 Chapter Organization 8
Chapter 2 System Structure of QBH System 11
2.1 Note-based Approaches 12
2.1.1 Onset Detection 13
2.1.2 Difference of Magnitude Method 14
2.1.3 Short-term Energy Method 15
2.1.4 Surf Method 18
2.1.5 Envelope Match Filter 19
2.2 Frame-based Approaches 21
2.3 Pitch Estimation 23
2.3.1 Autocorrelation Function 24
2.3.2 Average Magnitude Difference Function 24
2.3.3 Harmonic Product Spectrum 25
Chapter 3 Melody Matching 27
3.1 Dynamic Programming [28] 27
3.2 Hidden Markov Model [29] 29
3.3 Linear Scaling [25] 33
3.4 Note-based Linear Scaling and Recursive Align [26] 35
3.5 FFT-based Method [27] 37
3.6 Quantized Binary Code-based Linear Scaling [30] 39
Chapter 4 Proposed Algorithm 42
4.1 Proposed Onset Algorithm 43
4.2 Proposed Pitch Estimation 49
4.3 Proposed Melody Matching 53
4.3.1 Proposed Hidden Markov Model 56
4.3.2 Modified Dynamic Programming 62
4.3.3 Music Indexing 69
4.3.4 Progressive Filtering and Score Level Fusion 70
Chapter 5 Experiment Result 73
5.1 Music Dataset 73
5.2 Performance Parameter 74
5.3 Comparison of Onset Detection Performance 76
5.4 Comparison of Melody Matching Performance 81
Chapter 6 Conclusion and Future Work 86
6.1 Conclusion 86
6.2 Future Work 87
REFERENCE 89
dc.language.iso: en
dc.title: 應用節奏與頻率資訊之改良式哼唱檢索系統及改良式發端偵測與旋律匹配 (zh_TW)
dc.title: Improved Query by Humming System Using the Tempo and Frequency Information and Advanced Onset Detection and Melody Matching Methods (en)
dc.type: Thesis
dc.date.schoolyear: 104-2
dc.description.degree: Master (碩士)
dc.contributor.oralexamcommittee: 郭景明 (Jing-Ming Guo), 葉敏宏 (Min-Hung Yeh), 簡鳳村 (Feng-Tsun Chien)
dc.subject.keyword: 哼唱檢索, 發端辨識, 音頻偵測, 旋律比對, 隱藏式馬可夫模型 (zh_TW)
dc.subject.keyword: query by humming, beat, onset detection, pitch estimation, melody matching, hidden Markov model (en)
dc.relation.page: 94
dc.identifier.doi: 10.6342/NTU201600831
dc.rights.note: Not authorized for public access (未授權)
dc.date.accepted: 2016-07-12
dc.contributor.author-college: College of Electrical Engineering and Computer Science (電機資訊學院) (zh_TW)
dc.contributor.author-dept: Graduate Institute of Communication Engineering (電信工程學研究所) (zh_TW)
Appears in collections: Graduate Institute of Communication Engineering (電信工程學研究所)

Files in this item:
File: ntu-105-1.pdf
Size: 3.23 MB
Format: Adobe PDF
Access: Not authorized for public access (未授權公開取用)