Research on the Difference between Lyric Recognition and Speech Recognition in Mandarin Songs (標準漢語歌聲中歌詞辨識與一般語音辨識差異之研究)

Chu-An Yu; 游筑安

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/73892

Title:	Research on the Difference between Lyric Recognition and Speech Recognition in Mandarin Songs (標準漢語歌聲中歌詞辨識與一般語音辨識差異之研究)
Author:	Chu-An Yu (游筑安)
Advisor:	Chien-Kang Huang (黃乾綱)
Keywords:	Speech recognition, Lyric recognition, Music Information Retrieval, Convolutional Neural Network, Accompanied singing, A cappella signal, Mandarin pinyin
Year of publication:	2019
Degree:	Master's
Abstract:	Song retrieval is an indispensable part of everyday life: without knowing a song's title or singer, a user can find it just by humming or singing a few lines. Most song-search websites and mobile apps today rely on melody matching. This is inconvenient for users who cannot reproduce the pitch accurately, because an inaccurate melody does not return the correct result. Recognizing the words in the singing voice could therefore substantially improve search accuracy. This study compares singing with read speech and uses the differences to improve speech-recognition accuracy for sung lyrics. The work is divided into four parts. The first part observes and compares speech recognition on singing and on read speech, and the observations determine the direction of the experiments. The second part preprocesses the singing audio, removing noise and background accompaniment and keeping the vocals. The third part extracts features: after pre-emphasis, windowing, and related processing, the audio is converted into a spectrogram that serves as the input image for training the singing model. The fourth part trains an acoustic model with an end-to-end convolutional neural network (CNN) combined with connectionist temporal classification (CTC) to recognize the syllables sung in a song.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/73892
DOI:	10.6342/NTU201903515
Full-text license:	Paid license
Appears in collections:	Department of Engineering Science and Ocean Engineering
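
The feature-extraction step named in the abstract (pre-emphasis, windowing, conversion to a spectrogram) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the thesis's implementation: the pre-emphasis coefficient 0.97, the Hamming window, the frame length, the hop size, and all function names are assumptions, since this record does not state the parameters actually used. A practical pipeline would use a numerical library with a fast FFT instead of the naive DFT shown here.

```python
import cmath
import math


def pre_emphasis(signal, coeff=0.97):
    """Apply y[n] = x[n] - coeff * x[n-1]; coeff=0.97 is an assumed value."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - coeff * signal[n - 1]
                          for n in range(1, len(signal))]


def hamming(length):
    """Symmetric Hamming window (an assumed window choice)."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]


def frame_signal(signal, frame_len, hop):
    """Split the signal into overlapping frames; an incomplete tail is dropped."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]


def magnitude_spectrum(frame):
    """Naive O(N^2) DFT magnitudes for bins 0..N/2 (demo only)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def spectrogram(signal, frame_len=64, hop=32, coeff=0.97):
    """Pre-emphasis, framing, windowing, then log-magnitude spectrum per frame."""
    emphasized = pre_emphasis(signal, coeff)
    window = hamming(frame_len)
    rows = []
    for frame in frame_signal(emphasized, frame_len, hop):
        windowed = [x * w for x, w in zip(frame, window)]
        # Small offset avoids log(0) on silent bins.
        rows.append([math.log(m + 1e-10) for m in magnitude_spectrum(windowed)])
    return rows  # one row per frame, frame_len // 2 + 1 bins per row


if __name__ == "__main__":
    # Synthetic test tone with exactly 8 cycles per 64-sample frame.
    tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
    spec = spectrogram(tone)
    print(len(spec), "frames x", len(spec[0]), "bins")
```

Each row of the result is one time frame, so the list of rows can be treated as a 2-D image of the kind the abstract describes as input to the CNN.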

Files in this item:

File	Size	Format
ntu-108-1.pdf (not currently authorized for public access)	4.34 MB	Adobe PDF


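Unless their copyright terms are specifically indicated otherwise, all items in this system are protected by copyright, with all rights reserved.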
