Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/46439

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 李琳山(Lin-Shan Lee) | |
| dc.contributor.author | Chao-Hong Meng | en |
| dc.contributor.author | 孟昭宏 | zh_TW |
| dc.date.accessioned | 2021-06-15T05:09:05Z | - |
| dc.date.available | 2010-07-26 | |
| dc.date.copyright | 2010-07-26 | |
| dc.date.issued | 2010 | |
| dc.date.submitted | 2010-07-26 | |
| dc.identifier.citation | [1] Lawrence R. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1990. [2] J. K. Baker, “The DRAGON system - an overview,” in IEEE Trans. Acoust. Speech Signal Process., 1975, pp. 24–29. [3] L. Bahl, P. Brown, P. de Souza, and R. Mercer, “Maximum mutual information estimation of hidden Markov model parameters for speech recognition,” in ICASSP 1986, 1986. [4] B.-H. Juang, W. Chou, and C.-H. Lee, “Minimum classification error rate methods for speech recognition,” in IEEE Transactions on Speech and Audio Processing, 1997. [5] D. Povey and P. C. Woodland, “Minimum phone error and I-smoothing for improved discriminative training,” in ICASSP 2002, 2002. [6] J. Zheng and A. Stolcke, “Improved discriminative training using phone lattices,” in Interspeech 2005, 2005. [7] J. Du, P. Liu, F. K. Soong, J.-L. Zhou, and R.-H. Wang, “Minimum divergence based discriminative training,” in Interspeech 2006, 2006. [8] Daniel Jurafsky and James H. Martin, Speech and Language Processing, Pearson Education Taiwan Ltd., 2005. [9] Leonard E. Baum and Ted Petrie, “Statistical inference for probabilistic functions of finite state Markov chains,” The Annals of Mathematical Statistics, vol. 37, no. 6, pp. 1554–1563, 1966. [10] Hynek Hermansky, Daniel P. W. Ellis, and Sangita Sharma, “Tandem connectionist feature extraction for conventional HMM systems,” in Proc. ICASSP, 2000, pp. 1635–1638. [11] Eric Fosler-Lussier and Jeremy Morris, “Crandem systems: Conditional random field acoustic models for hidden Markov models,” in Proc. ICASSP, 2008, pp. 4049–4052. [12] Asela Gunawardana, Milind Mahajan, Alex Acero, and John C. Platt, “Hidden conditional random fields for phone classification,” in Interspeech, 2005, pp. 1117–1120. [13] Yun-Hsuan Sung, Constantinos Boulis, Christopher Manning, and Dan Jurafsky, “Regularization, adaptation, and non-independent features improve hidden conditional random fields for phone classification,” in IEEE ASRU 2007, 2007, pp. 639–642. [14] J. Morris and E. Fosler-Lussier, “Discriminative phonetic recognition with conditional random fields,” in HLT-NAACL Workshop on Computationally Hard Problems and Joint Inference in Speech and Language Processing, 2006. [15] Ioannis Tsochantaridis, Thomas Hofmann, Thorsten Joachims, and Yasemin Altun, “Support vector machine learning for interdependent and structured output spaces,” in ICML ’04, New York, NY, USA, 2004, p. 104, ACM. [16] Yasemin Altun, Ioannis Tsochantaridis, and Thomas Hofmann, “Hidden Markov support vector machines,” 2003. [17] Thorsten Joachims, “A support vector method for multivariate performance measures,” in ICML ’05: Proceedings of the 22nd International Conference on Machine Learning, New York, NY, USA, 2005, pp. 377–384, ACM. [18] Yisong Yue, Thomas Finley, Filip Radlinski, and Thorsten Joachims, “A support vector method for optimizing average precision,” in SIGIR ’07: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, New York, NY, USA, 2007, pp. 271–278, ACM. [19] Thorsten Joachims, “Training linear SVMs in linear time,” in KDD ’06: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 2006, pp. 217–226, ACM. [20] S. Sathiya Keerthi and S. Sundararajan, “CRF versus SVM-struct for sequence labeling,” Tech. Rep., Yahoo Research, 2007. [21] Vladimir N. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag New York, Inc., New York, NY, USA, 1995. [22] Christopher J. C. Burges, “A tutorial on support vector machines for pattern recognition,” Data Mining and Knowledge Discovery, vol. 2, pp. 121–167, 1998. [23] John C. Platt, “Sequential minimal optimization: A fast algorithm for training support vector machines,” Tech. Rep., Advances in Kernel Methods - Support Vector Learning, 1998. [24] Ioannis Tsochantaridis, Thorsten Joachims, Thomas Hofmann, and Yasemin Altun, “Large margin methods for structured and interdependent output variables,” J. Mach. Learn. Res., vol. 6, pp. 1453–1484, 2005. [25] Tom Minka, “Discriminative models, not discriminative training,” Tech. Rep., Microsoft Research, 2005. [26] X. Li, H. Jiang, and C. Liu, “Large margin HMMs for speech recognition,” in Proc. ICASSP 2005, 2005, pp. 513–516. [27] K. Lee and H. Hon, “Speaker-independent phone recognition using hidden Markov models,” in IEEE Transactions on Acoustics, Speech and Signal Processing, 1989, pp. 1641–1648. [28] X. Huang, A. Acero, and H.-W. Hon, Spoken Language Processing, Pearson Education Taiwan Ltd., 2005. [29] H. Hermansky, B. Hanson, and H. Wakita, “Perceptually based linear predictive analysis of speech,” Apr 1985, vol. 10, pp. 509–512. [30] S. Furui, “Cepstral analysis technique for automatic speaker verification,” IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 29, no. 2, pp. 254–272, Apr 1981. [31] Machine Intelligence Laboratory, Cambridge University Engineering Dept. (CUED), “HTK,” http://htk.eng.cam.ac.uk. [32] Speech Group, International Computer Science Institute, “QuickNet,” http://www.icsi.berkeley.edu/Speech/qn.html. [33] Thorsten Joachims, “SVM-HMM,” http://www.cs.cornell.edu/People/tj/svm_light/svm_hmm.html. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/46439 | - |
| dc.description.abstract | The speech recognition problem can be viewed as finding the word sequence that corresponds to a given segment of speech signal. Because the problem is structurally very complex, it has traditionally been decomposed via Bayes' theorem into two sub-problems, an acoustic model and a language model; these sub-problems have simpler structure and can conveniently be handled with hidden Markov models (a textbook form of this decomposition is sketched after the metadata table below). However, hidden Markov model parameters are traditionally estimated by maximum likelihood estimation, which easily causes confusion between different models. Discriminative training methods were therefore proposed to give the traditional model architecture discriminative power as well. With the progress of machine learning, it has gradually become possible to solve the speech recognition problem directly, without necessarily splitting it into two sub-problems, and such models are usually discriminative by nature. This thesis attempts preliminary phone recognition under such a framework, using a structural support vector machine as the model. Experiments show that the phone accuracy obtained exceeds that of a Tandem system by 1%. | zh_TW |
| dc.description.provenance | Made available in DSpace on 2021-06-15T05:09:05Z (GMT). No. of bitstreams: 1 ntu-99-R96922007-1.pdf: 2324389 bytes, checksum: a3fa436f2645b585cf856cb6d3d8700f (MD5) Previous issue date: 2010 | en |
| dc.description.tableofcontents | Abstract (in Chinese); Chapter 1: Introduction (1.1 Research Motivation, 1.2 Speech Recognition, 1.3 Related Work, 1.4 Main Approach and Results of This Thesis, 1.5 Thesis Organization); Chapter 2: Background (2.1 Continuous-Density Hidden Markov Models: 2.1.1 Model Structure, 2.1.2 Parameter Estimation; 2.2 Multi-Layered Perceptron: 2.2.1 Mathematical Definition, 2.2.2 Parameter Estimation; 2.3 Tandem Acoustic Models; 2.4 Chapter Summary); Chapter 3: Support Vector Machine Learning Algorithm (3.1 Introduction, 3.2 Primal Form, 3.3 Lagrange Duality, 3.4 Dual Form, 3.5 Kernel Mapping, 3.6 The Non-Separable Case, 3.7 Sequential Minimal Optimization, 3.8 Chapter Summary); Chapter 4: Structural Support Vector Machine Learning Algorithm (4.1 Introduction, 4.2 Primal and Dual Forms, 4.3 Cutting Plane Method, 4.4 Advantages of Structural SVMs, 4.5 Differences from Discriminative Training Criteria, 4.6 Chapter Summary); Chapter 5: Experiments and Analysis (5.1 Experimental Setup: 5.1.1 Corpus, 5.1.2 Front-End Processing, 5.1.3 Training and Recognition Systems, 5.1.4 Baseline Experiments, 5.1.5 Building the Structural SVM, 5.1.6 Evaluation Metrics; 5.2 Analysis of Results: 5.2.1 Preliminary Experimental Results; 5.3 Chapter Summary); Chapter 6: Conclusion and Future Work (6.1 Summary, 6.2 Future Work); References | |
| dc.language.iso | zh-TW | |
| dc.subject | 音素辨識 | zh_TW |
| dc.subject | 支撐向量機 | zh_TW |
| dc.subject | Phone Recognition | en |
| dc.subject | SVM | en |
| dc.title | 使用結構化支撐向量機之音素辨識 | zh_TW |
| dc.title | Phone Recognition using Structural Support Vector Machine | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 98-2 | |
| dc.description.degree | 碩士 (Master's) | |
| dc.contributor.oralexamcommittee | 王小川(Hsiao-Chuan Wang),陳信宏(Sin-Horng Chen),簡仁宗(Jen-Tzung Chien),鄭秋豫(Chiu-yu Tseng) | |
| dc.subject.keyword | 支撐向量機,音素辨識, | zh_TW |
| dc.subject.keyword | Phone Recognition,SVM, | en |
| dc.relation.page | 91 | |
| dc.rights.note | 有償授權 (authorization with compensation) | |
| dc.date.accepted | 2010-07-26 | |
| dc.contributor.author-college | 電機資訊學院 | zh_TW |
| dc.contributor.author-dept | 資訊工程學研究所 | zh_TW |
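
The abstract above describes decomposing the speech recognition problem via Bayes' theorem into an acoustic model and a language model. As a reading aid, the block below sketches the standard textbook form of that decomposition; it is not an equation quoted from the thesis, and the symbols O (acoustic observation sequence) and W (candidate word sequence) are notation introduced here:

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Illustrative sketch (standard textbook form, not quoted from the thesis):
% choose the word sequence W maximizing the posterior given acoustic observations O;
% P(O) does not depend on W, so it drops out of the maximization.
\begin{align*}
\hat{W} &= \arg\max_{W} P(W \mid O)
         = \arg\max_{W} \frac{P(O \mid W)\,P(W)}{P(O)} \\
        &= \arg\max_{W} \underbrace{P(O \mid W)}_{\text{acoustic model}}\;
           \underbrace{P(W)}_{\text{language model}}
\end{align*}
\end{document}
```

In the abstract's terms, the first factor is where hidden Markov models are traditionally applied, while the structural SVM approach studied in the thesis avoids this factorization and models the recognition task directly and discriminatively.
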
Appears in Collections: 資訊工程學系 (Department of Computer Science and Information Engineering)
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-99-1.pdf (Restricted Access) | 2.27 MB | Adobe PDF |
Except where the copyright terms are otherwise specified, all items in the system are protected by copyright, with all rights reserved.
