Acoustic Recognition Using Tandem System with Kullback-Leibler Divergence and Hierarchical Multi-layer Perceptron

Shun-Lung Chen; 陳順隆

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/44650

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	李琳山
dc.contributor.author	Shun-Lung Chen	en
dc.contributor.author	陳順隆	zh_TW
dc.date.accessioned	2021-06-15T03:52:18Z	-
dc.date.available	2010-07-15
dc.date.copyright	2010-07-15
dc.date.issued	2010
dc.date.submitted	2010-07-08
dc.identifier.citation	[1] Defense Advanced Research Projects Agency, http://www.darpa.mil. [2] L. Rabiner and B.H. Juang, Fundamentals of Speech Recognition, Prentice Hall, 1993. [3] Simon Haykin, Neural Networks: A Comprehensive Foundation, USA, 1999. [4] ICSI Speech FAQ: International Computer Science Institute (ICSI), http://www.icsi.berkeley.edu/speech/faq/nn-train.html. [5] Hynek Hermansky, Daniel P.W. Ellis, and Sangita Sharma, “Tandem connectionist feature extraction for conventional HMM systems,” in ICASSP, 2000. [6] H. Bourlard and N. Morgan, Connectionist Speech Recognition: A Hybrid Approach, Kluwer Academic Publishers, Boston, USA, 1994. [7] Arlo Faria, “An investigation of tandem MLP features for ASR,” Tech. Rep., ICSI, July 2007. [8] T. M. Cover and J. A. Thomas, Elements of Information Theory, John Wiley, 1991. [9] Guillermo Aradilla, Jithendra Vepa, and Hervé Bourlard, “An acoustic model based on Kullback-Leibler divergence for posterior features,” in ICASSP, 2007. [10] 張碩尹, “串接群聚階層式多層感知器聲學模型之中文大字彙語音辨識” [Mandarin large-vocabulary speech recognition using tandem clustered hierarchical multi-layer perceptron acoustic models], Master's thesis, Graduate Institute of Communication Engineering, National Taiwan University, 2009. [11] Cambridge University Engineering Dept. (CUED), Machine Intelligence Laboratory, “HTK,” http://htk.eng.cam.ac.uk/. [12] SRI Speech Technology and Research Laboratory, “SRILM,” http://www.speech.sri.com/projects/srilm/. [13] International Computer Science Institute (ICSI), http://www.icsi.berkeley.edu/Speech/qn.html. [14] 潘奕誠, “大字彙中文連續語音辨認之一段式及以詞圖為基礎之搜尋演算法” [One-pass and word-graph-based search algorithms for large-vocabulary continuous Mandarin speech recognition], Master's thesis, Graduate Institute of Computer Science and Information Engineering, National Taiwan University, 2002. [15] X. Huang, A. Acero, and H.-W. Hon, Spoken Language Processing, Pearson Education Taiwan Ltd., 2005. [16] S. Furui, “Cepstral analysis technique for automatic speaker verification,” IEEE Trans. Acoustics, Speech and Signal Processing, 1981, vol. 29, pp. 254–272. [17] S. M. Katz, “Estimation of probabilities from sparse data for the language model component of a speech recognizer,” IEEE Trans. Acoustics, Speech and Signal Processing, 1987, vol. 35, pp. 400–401. [18] Joel Pinto, S.R.M. Prasanna, B. Yegnanarayana, and Hynek Hermansky, “Significance of contextual information in phoneme recognition,” Tech. Rep., IDIAP, 2007. [19] Hao Tang, Chao-Hong Meng, and Lin-Shan Lee, “An initial attempt for phone recognition using structured support vector machine (SVM),” in ICASSP, 2010. [20] K. Lee and H. Hon, “Speaker-independent phone recognition using hidden Markov models,” IEEE Transactions on Acoustics, Speech and Signal Processing, 1989, pp. 1641–1648. [21] H. Hermansky, B. Hanson, and H. Wakita, “Perceptually based linear predictive analysis of speech,” Apr 1985, vol. 10, pp. 509–512. [22] Sherry Y. Zhao and Nelson Morgan, “Multi-stream spectro-temporal features for robust speech recognition,” in Interspeech, 2008.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/44650	-
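dc.description.abstract	Speech recognition can be viewed as finding the word sequence that corresponds to a given speech signal. Traditionally, Bayes' theorem is used to split the problem into two subproblems, acoustic recognition and language decoding. In acoustic recognition, the hidden Markov model (HMM) is the most widely used model. However, HMM parameters are conventionally estimated by maximum likelihood estimation, which tends to cause confusion between different models. Discriminative training was therefore proposed to make the conventional model architecture more discriminative. With advances in machine learning, different machine learning techniques can be combined to improve the HMM, and the tandem system, which combines HMMs with multi-layer perceptrons (MLPs), has become widely accepted. This thesis introduces the Kullback-Leibler (KL) divergence into the tandem system, which further improves large-vocabulary speech recognition accuracy over the conventional tandem system. Many studies replace the single MLP with a hierarchical MLP to improve large-vocabulary speech recognition accuracy; within this architecture, this thesis also uses the KL divergence to improve accuracy further. Finally, this thesis introduces the KL divergence and the hierarchical MLP into the hybrid system for phoneme recognition, which likewise yields better phoneme accuracy than the conventional hybrid system.	zh_TW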
dc.description.provenance	Made available in DSpace on 2021-06-15T03:52:18Z (GMT). No. of bitstreams: 1 ntu-99-R97921056-1.pdf: 1725250 bytes, checksum: 31079a0ae6f86a71ea482d99af2ad02c (MD5) Previous issue date: 2010	en
dc.description.tableofcontents	Oral Examination Committee Certification; Chinese Abstract; 1. Introduction; 1.1 Research Motivation; 1.2 Statistical Speech Recognition; 1.3 Acoustic Model; 1.4 Language Model; 1.5 Research Methods and Results of This Thesis; 1.6 Chapter Organization; 2. Background; 2.1 Multi-layer Perceptron; 2.2 Tandem and Hybrid Systems; 2.2.1 Tandem System; 2.2.2 Hybrid System; 2.3 HMM/KL Divergence Model; 2.3.1 KL Divergence; 2.3.2 HMM/KL Divergence Model; 2.3.3 HMM/Reverse KL Divergence Model; 2.3.4 HMM/Symmetric KL Divergence Model; 2.4 Experimental Speech Corpus and Model Settings; 2.4.1 Experimental Corpus; 2.4.2 Training and Recognition Toolkits; 2.4.3 Front-end Processing; 2.4.4 Acoustic Model Settings; 2.4.5 Lexicon and Language Model Settings; 2.5 Baseline Experimental Results; 3. KL Divergence Tandem System; 3.1 Baseline Tandem System Experiments; 3.2 KL Divergence Tandem System; 3.2.1 KL Divergence Tandem System; 3.2.2 Reverse KL Divergence Tandem System; 3.2.3 Symmetric KL Divergence Tandem System; 3.3 Experimental Results; 3.4 Chapter Conclusion; 4. Hierarchical KL Divergence MLP Tandem System; 4.1 Hierarchical MLP Tandem System; 4.2 Hierarchical KL Divergence MLP Tandem System; 4.3 Hierarchical Reverse KL Divergence MLP Tandem System; 4.4 Hierarchical Symmetric KL Divergence MLP Tandem System; 4.5 Chapter Conclusion; 5. Hierarchical KL Divergence MLP with KL Divergence Tandem System; 5.1 Hierarchical KL Divergence MLP with KL Divergence Tandem System; 5.2 Hierarchical KL Divergence MLP with Reverse KL Divergence Tandem System; 5.3 Hierarchical KL Divergence MLP with Symmetric KL Divergence Tandem System; 5.4 Experimental Results; 5.5 Chapter Conclusion; 6. Phoneme Recognition; 6.1 Experimental Speech Corpus and Model Settings; 6.1.1 Experimental Corpus; 6.1.2 Training and Recognition Toolkits; 6.1.3 Front-end Processing; 6.1.4 Baseline Acoustic Model Settings; 6.2 Baseline Experimental Results; 6.3 Hybrid System; 6.4 Hierarchical MLP Hybrid System; 6.5 Hierarchical KL Divergence MLP Hybrid System; 6.6 Chapter Conclusion; 7. Conclusions and Future Work; 7.1 Conclusions; 7.2 Future Work; References
dc.language.iso	zh-TW
dc.subject	聲學模型	zh_TW
dc.subject	Acoustic Model	en
dc.title	使用凱氏分歧度及階層式多層感知器的串接式聲學辨認	zh_TW
dc.title	Acoustic Recognition Using Tandem System with Kullback-Leibler Divergence and Hierarchical Multi-layer Perceptron	en
dc.type	Thesis
dc.date.schoolyear	98-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	陳信宏,王小川,鄭秋豫
dc.subject.keyword	聲學模型	zh_TW
dc.subject.keyword	Acoustic Model	en
dc.relation.page	53
dc.rights.note	Paid license
dc.date.accepted	2010-07-08
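dc.contributor.author-college	College of Electrical Engineering and Computer Science	zh_TW
dc.contributor.author-dept	Graduate Institute of Electrical Engineering	zh_TW
Appears in Collections:	Department of Electrical Engineering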

Files in This Item:

File	Size	Format
ntu-99-1.pdf (not authorized for public access)	1.68 MB	Adobe PDF


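All items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.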