Improvement of Landmark-based Audio Fingerprinting (以地標為特徵之音訊指紋系統的改進)

Wan-Ting Yen; 顏琬庭

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/50686

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	張智星(Jyh-Shing Jang)
dc.contributor.author	Wan-Ting Yen	en
dc.contributor.author	顏琬庭	zh_TW
dc.date.accessioned	2021-06-15T12:52:38Z	-
dc.date.available	2021-08-03
dc.date.copyright	2016-08-03
dc.date.issued	2016
dc.date.submitted	2016-07-19
dc.identifier.citation	[1] SHAZAM website: www.shazam.com. [2] SOUNDHOUND website: www.soundhound.com. [3] Yan Ke, Derek Hoiem, and Rahul Sukthankar, “Computer Vision for Music Identification,” in Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Diego, USA, June 2005. [4] Avery Li-Chun Wang, “An Industrial-Strength Audio Search Algorithm,” in Proceedings of the 4th International Conference on Music Information Retrieval (ISMIR), 2003. [5] Dalwon Jang, Chang D. Yoo, and Sunil Lee, “Pairwise boosted audio fingerprint,” in IEEE Transactions on Information Forensics and Security (TIFS), 2009. [6] Antonio C. Ibarrola and Edgar Chavez, “A robust entropy-based audio-fingerprint,” in IEEE International Conference on Multimedia and Expo (ICME), 2006. [7] Shumeet Baluja and Michele Covell, “Content fingerprinting using wavelets,” in World Conference on Innovation, Engineering, and Technology (IET), 2006. [8] Shumeet Baluja and Michele Covell, “Audio Fingerprinting: Combining Computer Vision & Data Stream Processing,” in Proc. of IEEE Conference on Acoustics, Speech and Signal Processing (ICASSP), 2007. [9] Shumeet Baluja and Michele Covell, “Waveprint: Efficient wavelet-based audio fingerprinting,” in Journal of Pattern Recognition, Volume 41, Issue 11, pp. 3467-3480, November 2008. [10] Michele Covell and Shumeet Baluja, “Known-audio detection using waveprint: spectrogram fingerprinting by wavelet hashing,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2007. [11] Shumeet Baluja and Michele Covell, “Learning to hash: forgiving hash functions and applications,” in Data Mining and Knowledge Discovery, 2008. [12] Job Oostveen, Ton Kalker, and Jaap Haitsma, “Feature extraction and a database strategy for video fingerprinting,” in Springer, 2002. [13] Jin S. Seo, Jaap Haitsma, Ton Kalker, and Chang D. Yoo, “A robust image fingerprinting system using the Radon transform,” in Signal Processing: Image Communication, 2004. [14] Jaap Haitsma and Ton Kalker, “A Highly Robust Audio Fingerprinting System,” in International Conference on Music Information Retrieval (ISMIR), 2002. [15] D. Ellis, “Robust Landmark-Based Audio Fingerprinting,” web resource: labrosa.ee.columbia.edu/matlab/fingerprint/, 2009. [16] Chung-Che Wang and Jyh-Shing Roger Jang, “Speeding Up Audio Fingerprinting over GPUs,” in Audio, Language and Image Processing (ICALIP), 2014.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/50686	-
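dc.description.abstract	In this thesis, we improve audio fingerprinting (AFP) technology. Audio fingerprinting is a technique for fast music retrieval, used mainly to identify the original song: a user can make a recording under any conditions and submit that recording as the query to the AFP system, which then reports the song most similar to it. AFP systems can also identify any data carried by sound, such as speeches and movie audio. The goal of this thesis is to improve a landmark-based audio fingerprinting system. Taking the energy of landmarks into account, we modify the landmark-extraction step: on the assumption that higher-energy peaks are more valuable features, we pair higher-energy peaks first and allow them a larger number of pairings. We then adjust the weights used in the final landmark-score computation; combining the proposed methods yields an improved recognition rate.	zh_TW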
dc.description.abstract	In this thesis, we improve audio fingerprinting (AFP) systems. Audio fingerprinting is a powerful technology for music retrieval: given a query, an AFP system identifies the most similar audio in its database. Users can record a query snippet in a noisy environment and submit it to the AFP system, which returns the most similar audio. AFP systems can also be used to identify any kind of sound, such as speech or the audio of movie snippets. The goal of this thesis is to improve the accuracy of an AFP system. We improve the landmark-extraction step by considering the energy of landmarks: we assume that peaks with higher energy are more valuable features and therefore combine them into landmarks with higher priority. In the final step, we compute the landmark count with a given weight. With all methods combined, the system achieves a 5.29% higher recognition rate than the original AFP system.	en
dc.description.provenance	Made available in DSpace on 2021-06-15T12:52:38Z (GMT). No. of bitstreams: 1 ntu-105-R03944037-1.pdf: 3144904 bytes, checksum: 2e7a545e77fc3d6798b309d63be8ea9b (MD5) Previous issue date: 2016	en
dc.description.tableofcontents	
Chinese Abstract
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Research Overview
  1.2 Chapter Outline
Chapter 2 Related Work
  2.1 System Architecture
  2.2 Haitsma's Method
    2.2.1 Feature Extraction
    2.2.2 Feature Matching
    2.2.3 Database Matching
  2.3 Wang's Method
    2.3.1 Feature Extraction
    2.3.2 Database Construction
    2.3.3 Database Matching
  2.4 GPU Acceleration
    2.4.1 GPU Architecture
    2.4.2 GPU Parallelization Method
    2.4.3 Experimental Results
Chapter 3 Proposed Improvements
  3.1 Changing the Maximum Number of Landmark Connections
    3.1.1 Hypothesis
    3.1.2 Improved Method
  3.2 Considering Peak Energy
    3.2.1 Hypothesis
    3.2.2 Improved Method
  3.3 Weighted Matched-Landmark Score
    3.3.1 Hypothesis
    3.3.2 Concept of the Improvement
    3.3.3 Improved Method
Chapter 4 Experimental Results and Discussion
  4.1 Experimental Setup
  4.2 Datasets and Corpora
  4.3 Changing the Maximum Number of Landmark Connections
    4.3.1 Database Size
    4.3.2 Experimental Results
    4.3.3 Conclusions for This Method
  4.4 Considering Peak Energy
    4.4.1 Database Size
    4.4.2 Experimental Results
    4.4.3 Conclusions for This Method
    4.4.4 Analysis of Peak Energy in Music Clips
  4.5 Weighted Matched-Landmark Score
    4.5.1 Landmark Analysis
    4.5.2 Experimental Results
    4.5.3 Conclusions for This Method
Chapter 5 Conclusions and Future Work
  5.1 Conclusions
  5.2 Future Work
References
dc.language.iso	zh-TW
dc.subject	music retrieval (音樂檢索)	zh_TW
dc.subject	feature extraction (特徵抽取)	zh_TW
dc.subject	audio fingerprinting (聲紋辨識)	zh_TW
dc.subject	landmark	en
dc.subject	music retrieval	en
dc.subject	feature extraction	en
dc.subject	audio fingerprinting	en
dc.title	以地標為特徵之音訊指紋系統的改進	zh_TW
dc.title	Improvement of Landmark-based Audio Fingerprinting	en
dc.type	Thesis
dc.date.schoolyear	104-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	王新民(Hsin-Min Wang),鄭士康(Shyh-Kang Jeng)
dc.subject.keyword	music retrieval (音樂檢索), audio fingerprinting (聲紋辨識), feature extraction (特徵抽取)	zh_TW
dc.subject.keyword	music retrieval, audio fingerprinting, feature extraction, landmark	en
dc.relation.page	50
dc.identifier.doi	10.6342/NTU201601030
dc.rights.note	Paid authorization (有償授權)
dc.date.accepted	2016-07-20
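dc.contributor.author-college	College of Electrical Engineering and Computer Science	zh_TW
dc.contributor.author-dept	Graduate Institute of Networking and Multimedia	zh_TW
Appears in Collections:	Graduate Institute of Networking and Multimedia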

Files in This Item:

File	Size	Format
ntu-105-1.pdf (not authorized for public access)	3.07 MB	Adobe PDF

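Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.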