Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/28744
Full metadata record
DC Field: Value [Language]
dc.contributor.advisor: 李琳山
dc.contributor.author: Po-Han Chuen
dc.contributor.author: 祝伯翰 [zh_TW]
dc.date.accessioned: 2021-06-13T00:20:29Z
dc.date.available: 2007-07-31
dc.date.copyright: 2007-07-31
dc.date.issued: 2007
dc.date.submitted: 2007-07-25
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/28744
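dc.description.abstract: For speech to serve as a human-machine interface that can be used anytime and anywhere, robustness research on reducing the effect of environmental mismatch on recognition accuracy has become an important direction. This thesis improves robustness to changes in the acoustic environment by processing the log Mel spectral energies at the front end.
This thesis applies a particle filter to track the noise in the log Mel spectral energy domain. The filter uses a set of particles to model the noise distribution and to find a vector close to the true noise. The tracked noise is then removed from the noisy speech with minimum mean-square error (MMSE) denoising.
For the prediction to be accurate, the particles must first be sampled near the true noise. Three methods are therefore used for the initial sampling: (1) random seeding, (2) an autoregressive model, and (3) an extended Kalman filter. Experiments show that the extended Kalman filter predicts the location of the noise best. On the international standard test environment Aurora2, averaged over all noise types and all signal-to-noise ratios, the particle filter that uses the extended Kalman filter for its prior sampling reaches a recognition accuracy of 77.77, compared with 60.06 without it. Recognition accuracy improves in every noise environment except subway and exhibition-hall noise.
[translated from zh_TW]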
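The tracking step described in the abstract can be illustrated with a minimal sketch. What follows is a generic bootstrap particle filter for a scalar noise level, not the thesis's implementation: the random-walk dynamics, the additive Gaussian observation model, and the names `track_noise`, `process_std`, and `obs_std` are all assumptions made for illustration. The thesis itself works on log Mel spectral energy vectors and compares random seeding, an autoregressive model, and an extended Kalman filter for the initial sampling; none of that is reproduced here.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of the normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def track_noise(observations, num_particles=500, process_std=0.1,
                obs_std=1.0, init_mean=0.0, init_std=5.0, seed=0):
    """Bootstrap particle filter for a scalar random-walk noise level.

    Assumed toy model (hypothetical, not the thesis's model):
        n_t = n_{t-1} + w_t,  w_t ~ N(0, process_std^2)
        y_t = n_t + v_t,      v_t ~ N(0, obs_std^2)
    Returns one posterior-mean estimate of n_t per observation.
    """
    rng = random.Random(seed)
    # Initial sampling: spread particles widely around a rough guess.
    particles = [rng.gauss(init_mean, init_std) for _ in range(num_particles)]
    estimates = []
    for y in observations:
        # Predict: move every particle through the random-walk dynamics.
        particles = [p + rng.gauss(0.0, process_std) for p in particles]
        # Weight: likelihood of the observation under each particle.
        weights = [gaussian_pdf(y, p, obs_std) for p in particles]
        total = sum(weights)
        if total == 0.0:
            # Every likelihood underflowed; fall back to uniform weights.
            weights = [1.0 / num_particles] * num_particles
        else:
            weights = [w / total for w in weights]
        # Estimate: the weighted mean is the MMSE estimate under the
        # particle approximation of the posterior.
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        # Resample: draw particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=num_particles)
    return estimates
```

Calling `track_noise` on a list of noisy readings returns one estimate per reading. The quality of the initial sampling matters in the same way the abstract describes: if `init_mean` and `init_std` place no particles near the true level, the weights collapse and the estimate takes much longer to recover.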
dc.description.provenance: Made available in DSpace on 2021-06-13T00:20:29Z (GMT). No. of bitstreams: 1
ntu-96-R94942052-1.pdf: 752726 bytes, checksum: 5732448e53ae39dd8e6070a6d7184524 (MD5)
Previous issue date: 2007 [en]
dc.description.tableofcontents: Table of Contents
Cover
Oral Examination Committee Certification
Acknowledgements
Chinese Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Motivation
1.2 Robustness Processing Methods
1.3 Main Results
1.4 Chapter Overview
Chapter 2 Research Background
2.1 Background Knowledge
2.1.1 Cepstral Mean Subtraction
2.1.2 Cepstral Normalization
2.1.3 Vector Autoregressive Model
2.2 Building the Baseline System
2.2.1 Corpus
2.2.2 Speech Feature Parameters
2.2.3 Acoustic Models and Baseline Experimental Results
2.2.3.1 Acoustic Models
2.2.3.2 Baseline Experimental Results
2.3 Chapter Conclusion
Chapter 3 Applying the Particle Filter to Robust Speech Recognition
3.1 Introduction to the Particle Filter
3.2 Applying the Particle Filter to Robustness Processing
3.2.1 State Model
3.2.2 Particle Filter Design
3.2.3 MMSE Denoising
3.3 Chapter Conclusion
Chapter 4 Alternative Implementations of the Particle Filter
4.1 MMSE Denoising (MMSE Denoise)
4.2 Vector-Wise Particle Filter
4.2.1 Random Seeding
4.2.2 Vector Autoregressive Model Particle Filter
4.3 Per-Parameter Particle Filter
4.3.1 Per-Parameter Random Seeding
4.3.2 Per-Parameter Autoregressive Model
4.4 Overall Comparison
4.5 Chapter Conclusion
Chapter 5 Extended Kalman Particle Filter
5.1 Theoretical Background
5.1.1 Sampling (Prediction) with the Extended Kalman Filter
5.1.2 Computing Weights from the Gaussian Mixture Distribution of the Noisy Speech
5.1.3 MMSE Denoising
5.2 Experimental Results
5.3 Chapter Conclusion
Chapter 6 Conclusions and Future Work
6.1 Conclusions
6.2 Future Work
References
dc.language.iso: zh-TW
dc.subject: MMSE denoising [zh_TW]
dc.subject: particle filter [zh_TW]
dc.subject: tracking [zh_TW]
dc.subject: robust speech [zh_TW]
dc.subject: denoising [zh_TW]
dc.subject: Particle Filter [en]
dc.subject: MMSE denoise [en]
dc.subject: Robust speech recognition [en]
dc.title: 強健性語音辨識中使用粒子群演算法之前端特徵處理 [zh_TW]
dc.title: Front-End Feature Processing using Particle Filter for Robust Speech Recognition [en]
dc.type: Thesis
dc.date.schoolyear: 95-2
dc.description.degree: Master's
dc.contributor.oralexamcommittee: 鄭秋豫, 陳信宏, 王小川
dc.subject.keyword: particle filter, tracking, robust speech, denoising, MMSE denoising [zh_TW]
dc.subject.keyword: Particle Filter, Robust speech recognition, MMSE denoise [en]
dc.relation.page: 60
dc.rights.note: Paid licensing
dc.date.accepted: 2007-07-27
dc.contributor.author-college: College of Electrical Engineering and Computer Science [zh_TW]
dc.contributor.author-dept: Graduate Institute of Communication Engineering [zh_TW]
Appears in collections: Graduate Institute of Communication Engineering

Files in this item:
File: ntu-96-1.pdf (not authorized for public access)
Size: 735.08 kB
Format: Adobe PDF