Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85655
Title: 獨立低階次矩陣分析結合強化學習於頻域音訊分離之分析探討
A Processing of Frequency Domain Audio Source Separation Based on ILRMA and Reinforcement Learning
Authors: Guan-Yu Chen
陳冠宇
Advisor: 王昭男(Chao-Nan Wang)
Keyword: Blind source separation (BSS), Audio source separation, Independent low-rank matrix analysis (ILRMA), Reinforcement learning, Q-learning
Publication Year: 2022
Degree: Master's
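Abstract: Audio source separation has long been a goal of signal processing; being able to extract a desired signal from a mixture of many audio signals enables a wide range of downstream applications. This thesis combines independent low-rank matrix analysis (ILRMA) with reinforcement learning to perform blind source separation on signals from unknown sources. Independent vector analysis (IVA) separates sources by relying on the statistical assumption that they are independent, while nonnegative matrix factorization (NMF) decomposes the spectrogram of the audio into time-varying gains and spectral patterns; the method that combines the two is called ILRMA. A modified version of ILRMA varies an exponent parameter (index number) across iterations, and adjusting this parameter effectively resolves the problem of spectrogram content above a certain frequency being assigned to the wrong source. The proposed method combines this exponent-modified ILRMA with reinforcement learning, using Q-learning to change the exponent parameter dynamically for better results. Separation experiments are conducted on (1) multiple speakers, (2) multiple musical instruments, and (3) singing-voice separation. Three methods are compared: (1) ILRMA, (2) exponent-modified ILRMA, and (3) exponent-modified ILRMA combined with reinforcement learning. Separation quality is assessed using time-domain waveforms, spectrograms, computation time, and the change in signal-to-distortion ratio (∆SI-SDR). The results show that exponent-modified ILRMA improves markedly on the original ILRMA, with the difference visible in both the waveforms and the spectrograms. Compared with exponent-modified ILRMA alone, adding Q-learning raises ∆SI-SDR slightly, confirming that the proposed approach can improve audio separation quality.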
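The ∆SI-SDR measure named in the abstract can be illustrated with a minimal sketch. It assumes the usual definition of the scale-invariant signal-to-distortion ratio and assumes that ∆ means the gain of the separated estimate over the unprocessed mixture; the record does not spell out either point, and the function names `si_sdr` and `delta_si_sdr` are illustrative only.

```python
import math


def si_sdr(estimate, reference):
    """Scale-invariant SDR in dB between two equal-length sample sequences."""
    if len(estimate) != len(reference):
        raise ValueError("estimate and reference must have the same length")
    ref_energy = sum(r * r for r in reference)
    if ref_energy == 0:
        raise ValueError("reference signal has zero energy")
    # Project the estimate onto the reference to find the optimally scaled target.
    scale = sum(e * r for e, r in zip(estimate, reference)) / ref_energy
    target = [scale * r for r in reference]
    # Everything left over after removing the target counts as distortion.
    noise = [e - t for e, t in zip(estimate, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(n * n for n in noise)
    if noise_energy == 0:
        return math.inf  # estimate is an exact scaled copy of the reference
    return 10 * math.log10(target_energy / noise_energy)


def delta_si_sdr(estimate, mixture, reference):
    """Assumed definition: SI-SDR of the estimate minus SI-SDR of the mixture."""
    return si_sdr(estimate, reference) - si_sdr(mixture, reference)


# Toy signals: the interference is orthogonal to the reference.
reference = [1.0, 0.0, -1.0, 0.0] * 4
interference = [0.0, 1.0, 0.0, -1.0] * 4
mixture = [r + i for r, i in zip(reference, interference)]
estimate = [r + 0.1 * i for r, i in zip(reference, interference)]
print(round(delta_si_sdr(estimate, mixture, reference), 3))
```

Because the metric projects the estimate onto the reference before comparing energies, rescaling the estimate does not change the score, which matters for blind separation methods whose output scale is arbitrary.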
Audio source separation has been a goal of signal processing for decades, and separating sounds cleanly would enable many applications. This research combines independent low-rank matrix analysis (ILRMA) with reinforcement learning (RL) to separate signals from unknown audio sources. Independent vector analysis (IVA) separates audio signals by exploiting the statistical independence between sources. Nonnegative matrix factorization (NMF), by contrast, decomposes a spectrogram into spectral patterns and time-varying gains. ILRMA was proposed as a unification of these two methods, integrating IVA and NMF. ILRMA with parametric index number adjustment can solve the block permutation problem in the spectrogram. This research uses Q-learning to change the parametric index number dynamically and achieve better results. Finally, the research discusses experimental results for (1) speech separation, (2) musical instrument separation, and (3) song separation using three methods: ILRMA, ILRMA with index number adjustment, and ILRMA with index number adjustment and Q-learning. The experimental results show that ILRMA with index number adjustment and Q-learning achieves better audio separation.
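To make the idea of Q-learning changing the index number dynamically more concrete, the following is a toy sketch of tabular Q-learning over a small grid of candidate values. Everything specific here is an assumption for illustration and is not taken from the thesis: the grid `EXPONENTS`, the three actions, the made-up optimum `TOY_BEST`, and the reward `toy_reward`, which stands in for whatever separation-quality signal the actual method uses.

```python
import random

# Hypothetical grid of index-number (exponent) values; not taken from the thesis.
EXPONENTS = [0.5, 1.0, 1.5, 2.0, 2.5]
ACTIONS = [-1, 0, +1]  # step down the grid, stay, step up the grid
TOY_BEST = 1.5         # made-up optimum used only by the toy reward


def toy_reward(exponent):
    """Stand-in for a separation-quality score (assumption, not the thesis's reward)."""
    return -abs(exponent - TOY_BEST)


def step(state, action):
    """Apply an action, clipping at the grid ends; return (next_state, reward)."""
    next_state = min(max(state + action, 0), len(EXPONENTS) - 1)
    return next_state, toy_reward(EXPONENTS[next_state])


def greedy_action(q, state):
    """Index of the highest-valued action in this state (first one wins ties)."""
    return max(range(len(ACTIONS)), key=lambda a: q[state][a])


def train(episodes=2000, steps=20, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in EXPONENTS]
    for _ in range(episodes):
        state = rng.randrange(len(EXPONENTS))  # random starting exponent
        for _ in range(steps):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = greedy_action(q, state)      # exploit
            next_state, reward = step(state, ACTIONS[a])
            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
            td_target = reward + gamma * max(q[next_state])
            q[state][a] += alpha * (td_target - q[state][a])
            state = next_state
    return q


q_table = train()
policy = [ACTIONS[greedy_action(q_table, s)] for s in range(len(EXPONENTS))]
print(policy)
```

With this toy reward, the learned greedy policy should step toward 1.5 from either side and stay there once it arrives. In a real separation loop the reward would have to be computed from the separated audio at each iteration rather than from a known optimum.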
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85655
DOI: 10.6342/NTU202201041
Fulltext Rights: Authorized (open to the public worldwide)
Embargo Lift Date: 2027-06-24
Appears in Collections: Department of Engineering Science and Ocean Engineering

Files in This Item:
File: U0001-2106202219353500.pdf (embargoed until 2027-06-24)
Size: 6.67 MB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
