Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/7292
Title: | 利用多任務學習模型建立發音特徵來改善華語錯誤發音偵測與診斷之回饋 (Mandarin Mispronunciation Detection and Diagnosis Feedback Using Articulatory Attributes Based Multi-task Learning) |
Author: | Xuan-Bo Chen (陳宣伯) |
Advisor: | Jyh-Shing Roger Jang (張智星) |
Keywords: | computer-assisted pronunciation training, mispronunciation detection, mispronunciation diagnosis, place and manner of articulation, articulatory features, multi-task learning, discriminative training, time-delay neural networks |
Publication year: | 2019 |
Degree: | Master's |
Abstract: | This thesis presents our research on computer-assisted pronunciation training (CAPT). We focus on mispronunciation detection and on articulation feedback tied to a model of the oral cavity. We propose incorporating speech attributes, namely place and manner of articulation, into the assessment models to improve mispronunciation detection and to return more precise diagnostic feedback to learners. We train a discriminative articulatory model based on time-delay neural networks (TDNNs) with a multi-task learning strategy to produce an articulatory score for each phone, and a TDNN-based acoustic model to produce a phonetic score. At test time, the system detects mispronunciations and returns articulation feedback based on both the phonetic and articulatory scores. Experiments on the MATBN Mandarin Chinese broadcast news corpus show that the proposed models outperform the Gaussian mixture model (GMM)-based and deep neural network (DNN)-based baselines in terms of equal error rate (EER) and diagnostic accuracy (DA). The proposed approach should be applicable to other languages, although the current system focuses on Mandarin. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/7292 |
DOI: | 10.6342/NTU201901749 |
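Full-text authorization: | Authorized (open access worldwide) |
Full-text release date: | 2024-07-25 |
Appears in collections: | Department of Computer Science and Information Engineering |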
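
The abstract reports results in terms of equal error rate (EER). As a minimal sketch of how that metric can be computed for a mispronunciation detector, the Python below sweeps a decision threshold over per-phone scores and returns the point where the false acceptance rate and false rejection rate are closest. The function name, the convention that a higher score means a better pronunciation, and the toy scores are all assumptions made for illustration; they are not taken from the thesis.

```python
from typing import Sequence, Tuple


def equal_error_rate(correct_scores: Sequence[float],
                     mispronounced_scores: Sequence[float]) -> Tuple[float, float]:
    """Approximate the EER by sweeping a threshold over the observed scores.

    Assumed convention (hypothetical): a phone is flagged as mispronounced
    when its score is below the threshold.
    Returns (eer, threshold) at the point where FAR and FRR are closest.
    """
    if not correct_scores or not mispronounced_scores:
        raise ValueError("both score lists must be non-empty")

    # Candidate thresholds: every observed score, plus one above the maximum
    # so that "flag everything" is also considered.
    candidates = sorted(set(correct_scores) | set(mispronounced_scores))
    candidates.append(candidates[-1] + 1.0)

    best_gap, best_eer, best_thr = None, None, None
    for thr in candidates:
        # False rejection: a correct phone is flagged as mispronounced.
        frr = sum(s < thr for s in correct_scores) / len(correct_scores)
        # False acceptance: a mispronounced phone is accepted as correct.
        far = sum(s >= thr for s in mispronounced_scores) / len(mispronounced_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer, best_thr = gap, (far + frr) / 2.0, thr
    return best_eer, best_thr


if __name__ == "__main__":
    # Toy scores, invented for illustration only.
    correct = [0.9, 0.8, 0.7, 0.4]
    mispronounced = [0.1, 0.2, 0.3, 0.6]
    eer, thr = equal_error_rate(correct, mispronounced)
    print(f"EER = {eer:.3f} at threshold {thr:.2f}")
```

With only a handful of scores the two error rates may never be exactly equal, so the sketch reports the average of the two at the closest point; a larger evaluation set would typically interpolate along the error curve instead.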
Files in this item:
File | Size | Format | |
---|---|---|---|
ntu-108-1.pdf | 3.15 MB | Adobe PDF | View/Open |
Unless their copyright terms are specifically stated, all items in this system are protected by copyright, with all rights reserved.