Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/91978
Title: | 語音處理的解離表徵學習、多任務學習及通用模型 Disentangled Representation Learning, Multi-task Learning and Universal Modeling for Speech Processing |
Author: | 陳奕禎 Yi-Chen Chen |
Advisor: | 李宏毅 Hung-yi Lee |
Keywords: | speech processing, multi-task learning, universal model, multimodal, disentangled representation |
Year of publication: | 2023 |
Degree: | Doctoral |
Abstract: | Owing to the success of deep learning, a growing number of speech processing applications built on powerful deep models have come to influence our lives. However, these models are typically task-specific and generalize poorly to other tasks. As a result, data and model architectures must be collected, designed, trained, and tuned separately for each task. If a universal model could learn and perform multiple speech processing tasks simultaneously, some tasks might improve through related abilities learned from other tasks, and data from the various tasks could be leveraged jointly. Such a universal model, however, must be able to extract different kinds of information from speech signals (content or speaker) and to handle tasks with different input and output modalities (speech or text). In this thesis, we propose approaches to address these two problems. We first disentangle and extract different kinds of information from speech signals with adversarial training. We then propose approaches that use multi-task learning to handle various tasks with a single model. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/91978 |
DOI: | 10.6342/NTU202302463 |
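Full-text authorization: | Authorized (open access worldwide) |
Appears in collections: | Graduate Institute of Communication Engineering |
Files in this item:
File | Size | Format | |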
---|---|---|---|
ntu-112-1.pdf | 4.29 MB | Adobe PDF | View/Open |
Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise specified.