Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/34747
Full metadata record
DC field | Value | Language
dc.contributor.advisor | Fei-Pei Lai (賴飛羆)
dc.contributor.author | Chun-Jung Hsiao | en
dc.contributor.author | 蕭鈞榮 | zh_TW
dc.date.accessioned | 2021-06-13T06:34:12Z
dc.date.available | 2006-01-26
dc.date.copyright | 2006-01-26
dc.date.issued | 2006
dc.date.submitted | 2006-01-20
dc.identifier.citation |
【1】M. Abe, S. Nakamura, K. Shikano, and H. Kuwabara, "Voice conversion through vector quantization," J. Acoust. Soc. Jpn. (E), vol. 11, no. 2, pp. 71-76, 1990.
【2】M. Mashimo, T. Toda, H. Kawanami, K. Shikano, and N. Campbell, "Cross-language voice conversion evaluation using bilingual databases," IPSJ Journal, vol. 43, no. 7, pp. 2177-2185, 2002.
【3】T. Toda, "High-quality and flexible speech synthesis with segment selection and voice conversion," Ph.D. thesis, Graduate School of Information Science, Nara Institute of Science and Technology, 2003.
【4】Ki Seung Lee, Dae Hee Yun, and Il Whan Cha, "A new voice transformation method based on both linear and nonlinear prediction analysis," Proc. International Conference on Spoken Language Processing, pp. 1401-1404, Philadelphia, USA, Oct. 1996.
【5】M. Tamura, T. Masuko, K. Tokuda, and T. Kobayashi, "Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR," Proc. ICASSP, pp. 805-808, Salt Lake City, USA, May 2001.
【6】H. Kawanami, Y. Iwami, T. Toda, H. Saruwatari, and K. Shikano, "GMM-based voice conversion applied to emotional speech synthesis," Proc. European Conference on Speech Communication and Technology (EUROSPEECH 2003), pp. 2401-2404, Geneva, Switzerland, Sep. 2003.
【7】D. G. Childers, "Glottal source modeling for voice conversion," Speech Communication, vol. 16, pp. 127-138, 1995.
【8】M. A. Kohler, "A comparison of the new 2400 bps MELP Federal Standard with other standard coders," Proc. Int. Conf. Acoust., Speech and Signal Processing, 1997.
【9】John Puterbaugh, "Voice Conversion," http://silvertone.princeton.edu/~john/voiceconversion.htm
【10】林明灶, "A Study of Mandarin Syllable Recognition: The Mixture Model Approach" (title translated from Chinese), Ph.D. dissertation, Graduate Institute of Electrical Engineering, Tatung University, 2003.
【11】徐志文, "A Study of Mandarin Keyword Spotting and Pronunciation Verification" (title translated from Chinese), M.S. thesis, National Taiwan University, 2000.
【12】L. M. Arslan and D. Talkin, "Voice conversion by codebook mapping of line spectral frequencies and excitation spectrum," Proc. EUROSPEECH, vol. 3, pp. 1347-1350, 1997.
【13】L. M. Supplee, R. P. Cohn, J. S. Collura, and A. V. McCree, "MELP: The new Federal Standard at 2400 bps," IEEE International Conference on Acoustics, Speech, and Signal Processing, Munich, Germany, 1997.
【14】M.-T. Lin, C.-K. Lee, and C.-Y. Lin, "Consonant/vowel segmentation for Mandarin syllable recognition," Comp. Speech and Lang., vol. 13, pp. 207-222, 1999.
【15】O. Salor and M. Demirekler, "Speech conversion using MELP speech coding algorithm," Proc. IEEE 12th Signal Processing and Communications Applications Conference, pp. 268-271, 28-30 April 2004.
【16】楊東敏, "High-Quality Voice Conversion Based on Linear Predictive Coding and Pitch-Synchronous Frames" (title translated from Chinese), M.S. thesis, National Central University, 2003.
【17】江佩芳, "A Study of Mixed Excitation Linear Prediction Speech Coding" (title translated from Chinese), M.S. thesis, National Cheng Kung University, 2001.
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/34747
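dc.description.abstract | This thesis builds a speech conversion method on top of the 2.4 kbps low-bit-rate Mixed Excitation Linear Prediction (MELP) speech coder, so that it can be applied in real-time communication to add entertainment value or even a degree of privacy.
Statistics gathered over a large speech corpus show that, for the same syllable spoken by the same speaker, the first- and second-stage vector indexes of the four-stage Line Spectrum Frequency (LSF) parameters produced by MELP analysis cluster around a small number of values. The thesis proposes a syllable-based mapping that builds a lookup table between the vocal-tract spectral features of a source speaker and a target speaker, reducing the discontinuities in the output speech caused by selecting the wrong syllable. In addition, the pitch periods of the two speakers are adjusted linearly, which modifies the original excitation (residual) signal. Simulation results confirm that the source speaker's voice can be converted to sound like the target speaker, and that the quality of the synthesized speech is satisfactory.
zh_TW (translated to English)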
dc.description.abstract | This work reuses the parameters of the 2.4 kbps Mixed Excitation Linear Prediction (MELP) voice coder to implement speech conversion from a source speaker to a specified target speaker.
Analyzing speech with the MELP algorithm, we found statistically that, for the same phoneme spoken by the same speaker, the first- and second-stage indexes of the MELP four-stage vector-quantized Line Spectral Frequencies (LSF) tend to cluster around certain index values. We propose a method based on Mandarin syllables that builds a mapping table of these indexes between the spectral features of the source and target speakers. To avoid the discontinuities in the output speech caused by syllable mismatches, we propose a new segmentation technique based on feature-vector frames. The pitch periods of the residual signal are also modified using a linear relationship. Simulation results show that the source speaker's voice can be converted to that of the target speaker, and that the quality of the synthesized speech is good.
en
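The two ideas in the abstract, a syllable-based table of dominant LSF index pairs and a linear pitch adjustment, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names, the data layout (a dict from syllable label to a list of (stage-1, stage-2) index pairs), the fallback for unmapped frames, and the choice of scaling pitch by the ratio of mean pitch periods are all assumptions made for illustration.

```python
from collections import Counter


def dominant_index_pair(frames):
    """Return the most frequent (stage-1, stage-2) LSF index pair among frames."""
    return Counter(frames).most_common(1)[0][0]


def build_mapping_table(source_frames, target_frames):
    """Map each syllable's dominant source index pair to the target's dominant pair.

    Both arguments are hypothetical dicts: syllable label -> list of
    (stage1_index, stage2_index) tuples, one tuple per analysis frame.
    Only syllables recorded for both speakers are included.
    """
    table = {}
    for syllable in source_frames.keys() & target_frames.keys():
        src = dominant_index_pair(source_frames[syllable])
        tgt = dominant_index_pair(target_frames[syllable])
        table[(syllable, src)] = tgt
    return table


def convert_frame(table, syllable, index_pair):
    """Look up the target index pair; keep the input pair when it is unmapped."""
    return table.get((syllable, index_pair), index_pair)


def scale_pitch_period(period, source_mean, target_mean):
    """Assumed linear rule: rescale a pitch period by the ratio of mean periods."""
    return period * target_mean / source_mean
```

In use, `build_mapping_table` would be run once on training frames from both speakers, after which `convert_frame` and `scale_pitch_period` would be applied frame by frame before MELP synthesis. How the thesis actually handles unmapped frames and which linear relationship it fits are not stated in this record.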
dc.description.provenance | Made available in DSpace on 2021-06-13T06:34:12Z (GMT). No. of bitstreams: 1
ntu-95-P92921005-1.pdf: 982140 bytes, checksum: 7f01b9be2063ef5d42a713b21d34b895 (MD5)
Previous issue date: 2006
en
dc.description.tableofcontents | Chinese Abstract i
Abstract ii
Acknowledgements iii
Contents iv
List of Figures vi
List of Tables vii
Chapter 1 INTRODUCTION 1
1.1. Motive 1
1.2. Background and Related works 2
1.3. Research Methodology 3
1.4. Organization 3
Chapter 2 RELATED TECHNOLOGIES OVERVIEW 5
2.1. Fundamental Knowledge of Speech 5
2.1.1. Vocal System 5
2.1.2. Characteristic of Mandarin Speech 6
2.2. Speech Conversion Introduction 9
2.3. MELP Speech Coding Basics 11
2.3.1. Encoder 11
2.3.2. Decoder 14
Chapter 3 RESEARCH METHODOLOGY 17
3.1. Source-Filter Model 17
3.1.1. Vocal Tract Filter 17
3.1.2. Excitation 20
3.2. Method of Spectral Mapping 22
3.2.1. Mandarin Syllable 22
3.2.2. Dynamic Time Warping 22
3.3. Syllable Segments 26
3.3.1. Methodology 26
Chapter 4 SIMULATION AND RESULTS 33
4.1. Simulation 33
4.1.1. Input Speech Data 33
4.1.2. Model Training 33
4.1.3. Modified Speech 35
4.2. Results 36
4.2.1. Mono-syllables 36
4.2.2. Continuous sentences 39
Chapter 5 CONCLUSIONS 43
REFERENCES 45
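dc.language.iso | en
dc.subject | 語音轉換 (speech conversion) | zh_TW
dc.subject | 混合激發線性預測 (mixed excitation linear prediction) | zh_TW
dc.subject | 國語音節 (Mandarin syllable) | zh_TW
dc.subject | MELP | en
dc.subject | Mandarin syllable | en
dc.subject | Speech Conversion | en
dc.title | 中文語音轉換在混合激發線性預測語音編碼器上之實現 | zh_TW
dc.title | Implement Mandarin Speech Conversion on Mixed Excitation Linear Prediction (MELP) CODEC | en
dc.type | Thesis
dc.date.schoolyear | 94-1
dc.description.degree | Master's (碩士)
dc.contributor.oralexamcommittee | Shyh-Kang Jeng (鄭士康), Po-Cheng Chen (陳柏誠)
dc.subject.keyword | 語音轉換, 混合激發線性預測, 國語音節 | zh_TW
dc.subject.keyword | MELP, Speech Conversion, Mandarin syllable | en
dc.relation.page | 46
dc.rights.note | Paid license (有償授權)
dc.date.accepted | 2006-01-23
dc.contributor.author-college | College of Electrical Engineering and Computer Science (電機資訊學院) | zh_TW
dc.contributor.author-dept | Graduate Institute of Electrical Engineering (電機工程學研究所) | zh_TW
Appears in collections: Department of Electrical Engineering (電機工程學系)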

Files in this item:
File | Size | Format
ntu-95-1.pdf (not authorized for public access) | 959.12 kB | Adobe PDF