  1. NTU Theses and Dissertations Repository
  2. College of Electrical Engineering and Computer Science
  3. Department of Electrical Engineering
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/41669
Title: Research on Pronunciation Mechanisms of Human Voice (人聲發音機制之探討)
Authors: Xuan Huang (黃璿)
Advisor: Yung-Yaw Chen (陳永耀)
Keyword: Speech Recognition, Acoustic Feature, Pronunciation Model, Vocal Tract Filter, Formant
Publication Year: 2009
Degree: Master's
Abstract: How speech is recognized remains an unsolved problem. In artificial intelligence and related fields, enabling computers to process visual and auditory information automatically, and to organize the meaning it carries, has been a popular research topic in recent years. Because speech is the most direct and rapid way to convey feelings and exchange information, and is used constantly in daily life, many researchers have sought ways to process voice signals and to comprehend the meaning behind speech, and speech recognition has become a popular topic in signal processing. Among speech recognition methods, formant detection is one that is based on physical structure. This thesis analyzes speech signals in depth and discusses how speech can be recognized using formants as features.
The topic of this thesis is how to find acoustic features with physical meaning that can serve as a reference for speech recognition. Speech recognition can be discussed in two aspects: the content of the speech and the identity of the speaker. When different speakers pronounce the same words, there must be some commonality that identifies the content, together with differences that distinguish the speakers. Because the variation that the human vocal organs can produce is limited, speakers can be regarded as differing innately in their vocal organs and then learning to change the shape of the vocal tract in order to speak. Among models based on the vocal tract, the most prominent and widely used acoustic feature is the formant, which is the location of a peak in the frequency spectrum; this thesis therefore takes formants as its object of study. Because consonants are comparatively less distinguishable both in the spectrum and to the ear, the thesis focuses on how vowel formants are distributed across different speakers.
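The abstract describes a formant as a peak in the frequency spectrum. As a rough illustration only, and not the method used in the thesis, the Python sketch below builds a synthetic vowel-like signal from two decaying sinusoids and reads the peak locations off its magnitude spectrum. The function names `synthetic_vowel` and `spectral_peaks`, the 8000 Hz sample rate, and the 700 Hz and 1200 Hz resonances are all assumptions chosen for the example. Real speech normally needs a smoothed spectral envelope (for instance from linear prediction) before peaks can be read as formants, because the raw spectrum also contains pitch harmonics.

```python
import numpy as np


def synthetic_vowel(resonances_hz, sample_rate=8000, num_samples=1024, decay=40.0):
    """Sum of exponentially decaying sinusoids, a crude stand-in for a vowel.

    All parameter values are illustrative assumptions, not taken from the thesis.
    """
    t = np.arange(num_samples) / sample_rate
    signal = np.zeros(num_samples)
    for f in resonances_hz:
        signal += np.exp(-decay * t) * np.sin(2.0 * np.pi * f * t)
    return signal


def spectral_peaks(signal, sample_rate, num_peaks=2):
    """Return the frequencies (Hz) of the strongest local maxima of the magnitude spectrum."""
    # A Hann window reduces spectral leakage before the FFT.
    windowed = signal * np.hanning(len(signal))
    magnitude = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)

    # A bin is a local maximum if it is strictly larger than both neighbours.
    interior = magnitude[1:-1]
    is_peak = (interior > magnitude[:-2]) & (interior > magnitude[2:])
    peak_idx = np.where(is_peak)[0] + 1

    # Keep the num_peaks strongest maxima and report them in ascending frequency.
    order = np.argsort(magnitude[peak_idx])[::-1][:num_peaks]
    return sorted(float(freqs[i]) for i in peak_idx[order])


# Hypothetical resonances standing in for the first two formants of a vowel.
peaks = spectral_peaks(synthetic_vowel([700.0, 1200.0]), sample_rate=8000)
print(peaks)
```

With 1024 samples at 8000 Hz the frequency bins are about 7.8 Hz apart, so the reported peaks can differ from the chosen resonances by up to a few hertz.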
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/41669
Fulltext Rights: Paid license
Appears in Collections: Department of Electrical Engineering

Files in This Item:
File: ntu-98-1.pdf (Restricted Access)
Size: 4.34 MB
Format: Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.