Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/43000

Full metadata record

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 李琳山(Lin-Shan Lee) | |
| dc.contributor.author | Shuo-Yiin Chang | en |
| dc.contributor.author | 張碩尹 | zh_TW |
| dc.date.accessioned | 2021-06-15T01:32:14Z | - |
| dc.date.available | 2009-07-24 | |
| dc.date.copyright | 2009-07-24 | |
| dc.date.issued | 2009 | |
| dc.date.submitted | 2009-07-20 | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/43000 | - |
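| dc.description.abstract | Among conventional acoustic models, the continuous-density hidden Markov model (HMM) is the most widely used. It has several shortcomings that are hard to overcome, however, and in recent years many studies have improved on it through alternative training methods or by combining it with other machine learning techniques. These approaches have gained recognition and attention in the new generation of speech recognition technology, and quite a few have been put into practice in international evaluations. This thesis investigates the use of multi-layer perceptrons (MLPs) to assist acoustic-model recognition. We propose a hierarchical MLP built by clustering phonemes. In a typical tandem system, a single MLP learns a general phoneme classification and has difficulty separating confusable phonemes. This thesis decomposes the general phoneme classification problem into a set of targeted hierarchical classifications, handling the complex problem by divide and conquer, and examines how the hierarchical MLP performs under different clustering structures. A bottom-up training method is then used to further improve the hierarchical MLP. Finally, with the above methods as the first recognition pass, the results are rescored by a hybrid HMM/MLP model and by a KL-based hidden Markov model. Experiments on Mandarin large-vocabulary broadcast news recognition confirm that all of these methods yield a clear improvement in recognition accuracy. | zh_TW |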
| dc.description.provenance | Made available in DSpace on 2021-06-15T01:32:14Z (GMT). No. of bitstreams: 1 ntu-98-R96942045-1.pdf: 2955239 bytes, checksum: 1bab5de9fafbbb794df33d290d44c371 (MD5) Previous issue date: 2009 | en |
| dc.description.tableofcontents | Oral Examination Committee Certification; Acknowledgements; Chinese Abstract; Contents; List of Figures; List of Tables; Chapter 1 Introduction; 1.1 Research Motivation; 1.2 Principles of Statistical Speech Recognition; 1.3 Acoustic Model; 1.4 Language Model; 1.5 Characteristics of Conventional Acoustic Models; 1.6 Research Approach of This Thesis; 1.7 Research Results of This Thesis; Chapter 2 Tandem Acoustic Models; 2.1 Comparison of Discriminative and Generative Models; 2.2 Neural Network Classifiers; 2.3 Tandem Model; 2.4 Tandem Model with Long-Term Features; 2.5 Combination of Posterior Features; 2.6 Experimental Speech Corpus and Model Configuration; 2.6.1 Experimental Corpus; 2.6.2 Training and Recognition Toolkits; 2.6.3 Front-End Processing; 2.6.4 Acoustic Model Configuration; 2.6.5 Lexicon and Language Model Configuration; 2.7 Baseline Experimental Results; 2.7.1 MLP Training and Classification Experiments; 2.7.2 Large-Vocabulary Recognition Experiments with the Tandem Model; Chapter 3 Clustered Hierarchical Tandem Model; 3.1 Phoneme Distance; 3.2 Hierarchical Clustering; 3.3 Clustered Hierarchical Tandem Model; 3.3.1 High-Level Perceptron; 3.3.2 Terminal Perceptron; 3.3.3 Integration of the High-Level and Terminal Perceptrons; 3.4 Experimental Results; 3.5 Analysis of Clustering Results; 3.6 Chapter Summary; Chapter 4 Improvements to the Clustered Hierarchical Tandem Model; 4.1 Drawbacks of Cluster-Based Training; 4.2 Bottom-Up Processing for the Clustered Hierarchical Tandem Model; 4.3 Experimental Results; 4.3.1 Baseline Experiments; 4.3.2 Experimental Results; Chapter 5 Hybrid HMM/MLP Model; 5.1 Hybrid HMM/MLP Model; 5.1.1 Architecture of the Hybrid HMM/MLP Model; 5.1.2 Training of the Hybrid HMM/MLP Model; 5.2 Hidden Markov (KL) Model; 5.3 Comparison of the Tandem and Hybrid Models; 5.4 Experimental Results; 5.5 Chapter Summary; Chapter 6 Conclusions and Future Work; 6.1 Conclusions; 6.2 Future Work; References | |
| dc.language.iso | zh-TW | |
| dc.subject | 大字彙語音辨識 | zh_TW |
| dc.subject | 聲學模型 | zh_TW |
| dc.subject | 多層感知器 | zh_TW |
| dc.subject | Multi-layer Perceptron | en |
| dc.subject | LVCSR | en |
| dc.subject | Acoustic Model | en |
| dc.title | 串接群聚階層式多層感知器聲學模型之中文大字彙語音辨識 | zh_TW |
| dc.title | Large Vocabulary Mandarin Speech Recognition Based on Tandem System with Clustered Hierarchical Multi-layer Perceptron | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 97-2 | |
| dc.description.degree | Master's | |
| dc.contributor.oralexamcommittee | 王小川(Hsiao-Chuan Wang),陳信宏(Sin-Horng Chen),鄭秋豫(Chiu-yu Tseng),簡仁宗(Jen-Tzung Chien) | |
| dc.subject.keyword | 大字彙語音辨識, 聲學模型, 多層感知器 | zh_TW |
| dc.subject.keyword | LVCSR, Acoustic Model, Multi-layer Perceptron | en |
| dc.relation.page | 68 | |
| dc.rights.note | Paid license | |
| dc.date.accepted | 2009-07-20 | |
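| dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW |
| dc.contributor.author-dept | Graduate Institute of Communication Engineering | zh_TW |
| Appears in collections: | Graduate Institute of Communication Engineering | |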
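Files in this item:

| File | Size | Format | |
|---|---|---|---|
| ntu-98-1.pdf (not authorized for public access) | 2.89 MB | Adobe PDF | |

All items in the system are protected by copyright, with all rights reserved, unless their copyright terms are specifically stated otherwise.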
