Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/74924
Title: | A Study of Self-supervised Speech Representation Decomposition (自監督式語音表徵分解之研究) |
Author: | Yao-Wen Mao (茅耀文) |
Advisor: | Lin-shan Lee (李琳山) |
Keywords: | speech, self-supervised, representation |
Year of publication: | 2019 |
Degree: | Master's |
Abstract: | This thesis explores how to separate the global and local information in a speech signal into different representations, using only speech without human annotation. It is commonly understood that, for speech spoken by the same person, the speaker characteristics are time-invariant information, whereas the speech content is time-varying information that is independent of the speaker characteristics. Separating these two types of information into representations that are easier to manipulate can benefit a variety of speech-related applications.
This thesis first re-examines the definition of independence between properties and lays out the assumptions required for speaker characteristics and speech content to be independent. Based on these assumptions, we use an autoencoder as the basic architecture and discuss how to constrain the representations so that their properties can be controlled and they can be decomposed into a global part and a local part. In the experiments, speaker identification and speech recognition serve as the main evaluation methods; we systematically investigate the effects of the different methods and compare their advantages and disadvantages in different aspects. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/74924 |
DOI: | 10.6342/NTU201904140 |
Full-text authorization: | Paid authorization |
Appears in collections: | Department of Electrical Engineering |
Files in this item:
File | Size | Format | |
---|---|---|---|
ntu-108-1.pdf (not currently authorized for public access) | 670.96 kB | Adobe PDF |
Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated.