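A Processing of Frequency Domain Audio Source Separation Based on ILRMA and Reinforcement Learning (獨立低階次矩陣分析結合強化學習於頻域音訊分離之分析探討)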

Guan-Yu Chen; 陳冠宇

Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85655

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	王昭男 (Chao-Nan Wang, wangcn@ntu.edu.tw)
dc.contributor.author	Guan-Yu Chen	en
dc.contributor.author	陳冠宇	zh_TW
dc.date.accessioned	2023-03-19T23:20:43Z	-
dc.date.copyright	2022-07-07
dc.date.issued	2022
dc.date.submitted	2022-06-24
dc.identifier.citation	[1] Comon, P.: Independent component analysis, a new concept? Signal Processing, 36 (1994), 287–314.
[2] Hyvärinen, A.; Karhunen, J.; Oja, E.: Independent Component Analysis. John Wiley & Sons, The United States of America, 2001.
[3] Cardoso, J.-F.: Infomax and maximum likelihood for blind source separation. IEEE Signal Process. Lett., 4 (4) (1997), 112–114.
[4] Kim, T.; Eltoft, T.; Lee, T.-W.: Independent vector analysis: An extension of ICA to multivariate components, in Proc. International Conference on Independent Component Analysis and Blind Source Separation, 2006 (LNCS3889). Springer, March 2006, 165–172.
[5] Ono, N.; Miyabe, S.: Auxiliary-function-based independent component analysis for super-Gaussian sources, in Proc. International Conference on Latent Variable Analysis and Signal Separation. Springer, 2010, 165–172.
[6] Ono, N.: Stable and fast update rules for independent vector analysis based on auxiliary function technique, in Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, October 2011, 189–192.
[7] Hunter, D.R.; Lange, K.: Quantile regression via an MM algorithm. Journal of Computational and Graphical Statistics, 9 (1) (2000), 60–77.
[8] Hunter, D.R.; Lange, K.: A tutorial on MM algorithms. The American Statistician, 58 (1) (2004), 30–37.
[9] Sun, Y.; Babu, P.; Palomar, D.P.: Majorization-minimization algorithms in signal processing, communications, and machine learning. IEEE Trans. Signal Process., 65 (3) (2017), 794–816.
[10] Lee, D.; Seung, H.: Algorithms for non-negative matrix factorization, in Advances in Neural Information Processing Systems, vol. 13, 2001, 556–562.
[11] Févotte, C.; Bertin, N.; Durrieu, J.-L.: Nonnegative matrix factorization with the Itakura-Saito divergence: with application to music analysis. Neural Computation, 21 (3) (2009), 793–830.
[12] Kitamura, D.; Ono, N.; Sawada, H.; Kameoka, H.; Saruwatari, H.: Determined blind source separation unifying independent vector analysis and nonnegative matrix factorization. IEEE/ACM Trans. Audio, Speech, Language Process., 24 (9) (2016), 1626–1641.
[13] Kitamura, D.; Ono, N.; Sawada, H.; Kameoka, H.; Saruwatari, H.: Determined blind source separation with independent low-rank matrix analysis, in Makino, S. Ed., Audio Source Separation. Springer, Cham, Switzerland, March 2018, 125–155.
[14] Sawada, H.; Ono, N.; Kameoka, H.; Kitamura, D.; Saruwatari, H.: A review of blind source separation methods: two converging routes to ILRMA originating from ICA and NMF. SIP, vol. 8, e12 (2019), 1–14.
[15] Mitsui, Y.; Kitamura, D.; Takamune, N.; Saruwatari, H.; Takahashi, Y.; Kondo, K.: "Independent low-rank matrix analysis based on parametric majorization-equalization algorithm," 2017 IEEE 7th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), 2017.
[16] Sutton, R.S.; Barto, A.G.: Reinforcement Learning: An Introduction, 2nd ed. MIT Press, Cambridge, MA, USA, 2014.
[17] Yeh, Y.-L.; Yang, P.-K.: Design and Comparison of Reinforcement-Learning-Based Time-Varying PID Controllers with Gain-Scheduled Actions. Machines 2021, 9, 319.
[18] Pyroomacoustics (audio software package). Room simulation. Retrieved from https://pyroomacoustics.readthedocs.io/en/pypirelease/pyroomacoustics.room.html (Sep. 10, 2021).
[19] Community-Based Signal Separation Evaluation Campaign. Song audio dataset. Retrieved from https://sisec.inria.fr/ (Sep. 10, 2021).
[20] CMU_ARCTIC speech synthesis databases. Speech audio dataset. Retrieved from http://www.festvox.org/cmu_arctic/ (Mar. 12, 2021).
[21] Daichi Kitamura's personal website. Musical instrument audio dataset. Retrieved from http://dkitamura.net/dataset.html (Mar. 12, 2021).
[22] 林昱偉: "An Analysis of Time-Domain Audio Separation Using Independent Component Analysis Combined with Subspace Enhancement" (獨立成分分析結合子空間增強於時域音訊分離之分析探討). Master's thesis, Graduate Institute of Engineering Science and Ocean Engineering, National Taiwan University, 2018. https://hdl.handle.net/11296/4gr5s7
[23] Kagami, H.; Kameoka, H.; Yukawa, M.: "Joint Separation and Dereverberation of Reverberant Mixtures with Determined Multichannel Non-Negative Matrix Factorization," 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018, pp. 31–35.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85655	-
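dc.description.abstract	Audio source separation has long been a goal of signal processing; being able to extract a desired signal from a mixture of many sounds opens up a wide range of downstream applications. This thesis combines independent low-rank matrix analysis (ILRMA) with reinforcement learning to perform blind source separation on signals from unknown sources. Independent vector analysis (IVA) separates sources by exploiting the statistical assumption that they are mutually independent, while nonnegative matrix factorization (NMF) decomposes the spectrogram of the audio into spectral patterns and their time-varying gains; combining the two methods yields ILRMA. The modified version of ILRMA introduces an exponent parameter into the iterative update rules, and tuning this exponent effectively resolves the problem of components being misassigned above a certain frequency in the spectrogram. The proposed method combines exponent-modified ILRMA with reinforcement learning, using Q-learning to adjust the exponent parameter dynamically for better results. Separation experiments are carried out on (1) multiple speakers, (2) multiple musical instruments, and (3) singing voice separation, comparing three methods: (1) ILRMA, (2) exponent-modified ILRMA, and (3) exponent-modified ILRMA combined with reinforcement learning. Separation quality is assessed using waveforms, spectrograms, computation time, and the change in signal-to-distortion ratio (∆SI-SDR). The results show that exponent-modified ILRMA improves markedly on the original ILRMA, with the difference visible directly in the waveforms and spectrograms. Compared with exponent-modified ILRMA alone, adding Q-learning raises ∆SI-SDR slightly, confirming that the proposed method can improve audio separation quality.	zh_TW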
dc.description.abstract	Audio source separation has been a goal of signal processing for decades, and the ability to separate sounds cleanly has many applications. This research combines independent low-rank matrix analysis (ILRMA) with reinforcement learning (RL) to separate unknown audio signals. Independent vector analysis (IVA) separates audio signals by exploiting the statistical independence between sources. Nonnegative matrix factorization (NMF), in contrast, decomposes a spectrogram into spectral patterns and time-varying gains. ILRMA has been proposed as a convergence point of these two methods, integrating IVA and NMF. ILRMA with parametric index-number adjustment can solve the block permutation problem in the spectrogram. This research uses Q-learning to change the parametric index number dynamically and so achieve better results. Finally, experimental results are discussed for (1) speech separation, (2) musical instrument separation, and (3) song separation using three methods: ILRMA, ILRMA with index-number adjustment, and ILRMA with index-number adjustment and Q-learning. The results show that ILRMA with index-number adjustment and Q-learning achieves better audio separation.	en
dc.description.provenance	Made available in DSpace on 2023-03-19T23:20:43Z (GMT). No. of bitstreams: 1 U0001-2106202219353500.pdf: 6828561 bytes, checksum: c16eb9f430e9057c541c14da42cb0edc (MD5) Previous issue date: 2022	en
dc.description.tableofcontents	Acknowledgements
Chinese Abstract
ABSTRACT
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Motivation
1.2 Literature Review
1.3 Thesis Organization
Chapter 2 Signal Separation Theory
2.1 Independent Component Analysis
2.2 Independent Vector Analysis
2.2.1 Objective Function of Independent Vector Analysis
2.2.2 Auxiliary Variable Method
2.2.3 Iterative Update Rules of Independent Vector Analysis
2.3 Nonnegative Matrix Factorization
2.3.1 Objective Function of Nonnegative Matrix Factorization
2.3.2 Iterative Update Rules of Nonnegative Matrix Factorization
2.4 Independent Low-Rank Matrix Analysis
2.5 Signal-to-Distortion Ratio
Chapter 3 Improved Audio Separation Methods
3.1 Exponent-Modified Independent Low-Rank Matrix Analysis
3.2 Exponent-Modified Independent Low-Rank Matrix Analysis Combined with Reinforcement Learning
Chapter 4 Audio Separation Results
4.1 Audio Separation Algorithms
4.2 Mixtures Generated by Room Acoustics Simulation
4.2.1 Room Acoustics Simulation Setup
4.2.2 Separation Results for Simulated Mixtures
4.3 Mixtures Measured in an Anechoic Chamber
4.3.1 Anechoic Chamber Experimental Setup
4.3.2 Separation Results for Anechoic Chamber Mixtures
Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References
dc.language.iso	zh-TW
dc.title	獨立低階次矩陣分析結合強化學習於頻域音訊分離之分析探討	zh_TW
dc.title	A Processing of Frequency Domain Audio Source Separation Based on ILRMA and Reinforcement Learning	en
dc.type	Thesis
dc.date.schoolyear	110-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	宋家驥(Chia-Chi Sung),湯耀期(Yao-Chi Tang),謝傳璋(Chuan-Zhang Xie)
dc.subject.keyword	Blind source separation, audio source separation, independent low-rank matrix analysis, reinforcement learning, Q-learning	zh_TW
dc.subject.keyword	Blind source separation (BSS), Audio source separation, Independent low-rank matrix analysis (ILRMA), Reinforcement learning, Q-learning	en
dc.relation.page	66
dc.identifier.doi	10.6342/NTU202201041
dc.rights.note	Authorization granted (open access worldwide)
dc.date.accepted	2022-06-27
dc.contributor.author-college	College of Engineering	zh_TW
dc.contributor.author-dept	Graduate Institute of Engineering Science and Ocean Engineering	zh_TW
dc.date.embargo-lift	2027-06-24	-
Appears in Collections:	Department of Engineering Science and Ocean Engineering
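
The abstract evaluates separation quality with ∆SI-SDR. The sketch below shows one common way to compute it, assuming the standard scale-invariant SDR definition and assuming that ∆SI-SDR means the SI-SDR of the separated signal minus the SI-SDR of the unprocessed mixture, both measured against the same clean reference. The function names `si_sdr` and `delta_si_sdr` are illustrative and do not come from the thesis, whose exact evaluation code is not part of this record.

```python
import math


def si_sdr(estimate, reference):
    """Scale-invariant SDR in dB for two equal-length real-valued sequences."""
    if len(estimate) != len(reference):
        raise ValueError("estimate and reference must have the same length")
    dot = sum(e * r for e, r in zip(estimate, reference))
    ref_energy = sum(r * r for r in reference)
    if ref_energy == 0.0:
        raise ValueError("reference signal has zero energy")
    # Project the estimate onto the reference to remove any overall gain.
    scale = dot / ref_energy
    target = [scale * r for r in reference]
    residual = [e - t for e, t in zip(estimate, target)]
    target_energy = sum(t * t for t in target)
    residual_energy = sum(n * n for n in residual)
    if residual_energy == 0.0:
        return math.inf   # perfect reconstruction up to scale
    if target_energy == 0.0:
        return -math.inf  # estimate is orthogonal to the reference
    return 10.0 * math.log10(target_energy / residual_energy)


def delta_si_sdr(estimate, mixture, reference):
    """Improvement over the mixture (assumed definition of Delta SI-SDR)."""
    return si_sdr(estimate, reference) - si_sdr(mixture, reference)


if __name__ == "__main__":
    reference = [1.0, -1.0, 1.0, -1.0]
    interference = [1.0, 1.0, 1.0, 1.0]  # orthogonal to the reference
    mixture = [r + i for r, i in zip(reference, interference)]
    estimate = [r + 0.1 * i for r, i in zip(reference, interference)]
    print(round(delta_si_sdr(estimate, mixture, reference), 3))
```

In this toy case the interference is attenuated by a factor of 10 in amplitude, so the improvement is about 20 dB. Rescaling `estimate` by any nonzero constant leaves the value unchanged, which is the point of the scale-invariant form.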

Files in This Item:

File	Size	Format
U0001-2106202219353500.pdf (publicly available online after 2027-06-24)	6.67 MB	Adobe PDF


Unless their copyright terms are specifically stated, all items in the system are protected by copyright, with all rights reserved.