Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/50690
| Title: | Tri-Articulatory Feature and Multi-input/Multi-target Deep Neural Network (三連發聲特徵與多輸入多目標之深層類神經網路) |
| Author: | Chih-Hsiang Yang (楊植翔) |
| Advisor: | 李琳山 (Lin-shan Lee) |
| Keywords: | articulatory feature, bottleneck feature, deep neural network (DNN), multi-target DNN, multi-input DNN |
| Year of publication: | 2016 |
| Degree: | Master's |
| Abstract: | The tri-articulatory feature (tri-AF) is a context-dependent articulatory feature. Because the shape of the mouth changes continuously during speech, the same articulatory feature should be realized differently when the neighboring phones differ. In this thesis, articulatory features are divided into eight categories, and a context-dependent hidden Markov model (HMM) is built for each category to obtain tri-AF labels.
In speech recognition, deep neural networks (DNNs) are widely used to build acoustic models, and multi-target DNN training has been shown to improve them. Following this approach, the thesis uses triphones, graphemes, and tri-AFs as multiple training targets to strengthen the acoustic model. Two-stage DNNs have also become popular in recent years: the first-stage DNN acts as a feature extractor, and the extracted features are concatenated with the acoustic features to form the input to the second-stage DNN. This thesis combines acoustic features with grapheme features, tri-AF features, monolingual bottleneck features, and multilingual bottleneck features to realize a multi-input DNN. Finally, the two approaches are combined into a multi-input/multi-target DNN; they complement each other and yield the best recognition results in the experiments. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/50690 |
| DOI: | 10.6342/NTU201601024 |
| Full-text license: | Paid license |
| Appears in collections: | Graduate Institute of Communication Engineering |
Files in this item:
| File | Size | Format | |
|---|---|---|---|
| ntu-105-1.pdf (not authorized for public access) | 7.31 MB | Adobe PDF | |
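Unless their copyright terms are specifically stated otherwise, all items in this system are protected by copyright, with all rights reserved.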
