Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/19994
Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 張智星 | |
| dc.contributor.author | Yu-Jui Su | en |
| dc.contributor.author | 蘇俞睿 | zh_TW |
| dc.date.accessioned | 2021-06-08T02:38:32Z | - |
| dc.date.copyright | 2018-07-19 | |
| dc.date.issued | 2018 | |
| dc.date.submitted | 2018-07-17 | |
| dc.identifier.citation | [1] http://speech.ee.ntu.edu.tw/DSP2017Autumn/ [2] https://en.wikipedia.org/wiki/Window_function [3] Dempster, Arthur P., Nan M. Laird, and Donald B. Rubin. 'Maximum likelihood from incomplete data via the EM algorithm.' Journal of the Royal Statistical Society. Series B (Methodological) (1977): 1-38. [4] Reynolds, Douglas A., Thomas F. Quatieri, and Robert B. Dunn. 'Speaker verification using adapted Gaussian mixture models.' Digital Signal Processing 10.1-3 (2000): 19-41. [5] Gauvain, J-L., and Chin-Hui Lee. 'Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains.' IEEE Transactions on Speech and Audio Processing 2.2 (1994): 291-298. [6] Kenny, Patrick. 'Joint factor analysis of speaker and session variability: Theory and algorithms.' CRIM, Montreal, (Report) CRIM-06/08-13 14 (2005): 28-29. [7] Kenny, Patrick, et al. 'A study of interspeaker variability in speaker verification.' IEEE Transactions on Audio, Speech, and Language Processing 16.5 (2008): 980-988. [8] Dehak, Najim, et al. 'Front-end factor analysis for speaker verification.' IEEE Transactions on Audio, Speech, and Language Processing 19.4 (2011): 788-798. [9] Mika, Sebastian, et al. 'Fisher discriminant analysis with kernels.' Neural Networks for Signal Processing IX, 1999. Proceedings of the 1999 IEEE Signal Processing Society Workshop. IEEE, 1999. [10] Prince, Simon JD, and James H. Elder. 'Probabilistic linear discriminant analysis for inferences about identity.' Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on. IEEE, 2007. [11] Kenny, Patrick. 'Bayesian speaker verification with heavy-tailed priors.' Odyssey. 2010. [12] Garcia-Romero, Daniel, and Carol Y. Espy-Wilson. 'Analysis of i-vector length normalization in speaker recognition systems.' Twelfth Annual Conference of the International Speech Communication Association. 2011. [13] Lyu, Siwei, and Eero P. Simoncelli. 'Nonlinear extraction of independent components of natural images using radial gaussianization.' Neural Computation 21.6 (2009): 1485-1519. [14] https://www.nist.gov/itl/iad/mig/speaker-recognition-evaluation-2010 [15] Garofolo, John S., et al. TIMIT Acoustic-Phonetic Continuous Speech Corpus LDC93S1. Web Download. Philadelphia: Linguistic Data Consortium, 1993. [16] Anjos, André, et al. 'Continuously reproducing toolchains in pattern recognition and machine learning experiments.' (2017). [17] Kanervisto, Anssi, et al. 'Effects of gender information in text-independent and text-dependent speaker verification.' Acoustics, Speech and Signal Processing (ICASSP), 2017 IEEE International Conference on. IEEE, 2017. [18] M. Wang, Y. Chen, Z. Tang, and E. Zhang, 'I-vector based speaker gender recognition,' in 2015 IEEE Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), 2015, pp. 729-732. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/19994 | - |
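| dc.description.abstract | In speaker verification, a common way to improve recognition accuracy without changing the architecture of the acoustic model is to replace the gender-independent model with gender-dependent models trained separately on male and female speech. In practice, however, the gender of a test speaker is unknown, so the gender classifier plays a very important role in this pipeline, and its accuracy directly affects the performance of the speaker verification system. Ensuring that the system correctly rejects impostors of either gender is another important requirement of this approach.
To investigate how different ways of using speaker gender information affect a speaker verification system, this thesis implements a speaker verification system that uses i-vectors as speaker features and a probabilistic linear discriminant analysis (PLDA) model as the scorer, together with three i-vector-based gender classifiers. After analyzing the weaknesses of conventional speaker verification systems that use gender-dependent models, it proposes several other ways of applying gender information under two conditions, namely when the gender classifier performs well and when it performs poorly, and analyzes how each method performs under different gender compositions of impostors. Finally, it achieves the goal of outperforming the conventional approach in every condition. | zh_TW |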
| dc.description.abstract | For the speaker verification task, one way to improve a system's accuracy without changing the acoustic model's algorithm is to use gender-dependent models instead of a gender-independent one. However, because the test speakers' genders are not available, the gender classifier plays an important role: its accuracy directly affects the performance of the whole speaker verification system. Furthermore, ensuring that the system maintains good performance under different gender compositions of test speakers is also an important requirement.
To explore how different uses of gender information affect a speaker verification system, this thesis implements a speaker verification system that uses i-vectors as speaker features and a PLDA model for scoring, along with three i-vector-based gender classifiers. After analyzing the weaknesses of speaker verification systems that use gender-dependent models in the conventional way, we propose several methods for applying gender information, both when the gender classifier performs well and when it performs poorly, and we analyze the performance of each method under different gender compositions of test speakers. Finally, we reach the goal of making our system outperform the traditional practice under these different circumstances. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-08T02:38:32Z (GMT). No. of bitstreams: 1 ntu-107-R05922026-1.pdf: 5357033 bytes, checksum: c1f6b1e8112bc94fa79f245aeb89097d (MD5) Previous issue date: 2018 | en |
| dc.description.tableofcontents | Oral Examination Committee Certification; Acknowledgements; Abstract (Chinese); Abstract (English); Table of Contents; List of Figures and Tables
Chapter 1 Introduction: 1.1 Introduction to Speaker Recognition; 1.1.1 Speaker Identification and Speaker Verification; 1.1.2 Standard Pipeline of Speaker Recognition; 1.1.3 Evaluation Metrics for Speaker Recognition; 1.2 Research Overview; 1.3 Chapter Outline
Chapter 2 Speech Processing and Speaker Recognition Techniques: 2.1 Speech Signal Features: MFCC; 2.2 Gaussian Mixture Models; 2.2.1 Overview of Gaussian Mixture Models; 2.2.2 Universal Background Model and Its Training Procedure; 2.2.3 Speaker-Dependent Model and Its Training Procedure; 2.3 Factor Analysis and i-Vectors; 2.3.1 Concept and Evolution of i-Vectors; 2.3.2 Training Procedure of the i-Vector Model; 2.4 Dimensionality Reduction and Channel Compensation for i-Vectors: Linear Discriminant Analysis; 2.5 Speaker Scoring Model: Probabilistic Linear Discriminant Analysis; 2.5.1 Model Training Procedure of Probabilistic Linear Discriminant Analysis; 2.5.2 Scoring Procedure of Probabilistic Linear Discriminant Analysis
Chapter 3 Experimental Results: 3.1 Experimental Corpora and Data Configuration; 3.1.1 Overview of the NIST SRE 2010 Corpus and Its Data Configuration; 3.1.2 Overview of the TIMIT Corpus and Its Data Configuration; 3.2 Experiment 1: Control Group; 3.3 Experiment 2: Methods for Applying Gender Information When the Gender Classifier Performs Well; 3.3.1 Prior Approach (Common Practice); 3.3.2 Method 1: Rejecting Test Speakers Whose Gender Does Not Match the Enrolled Speaker's; 3.3.3 Method 2: Using Gender as an Additional One-Dimensional Feature; 3.4 Experiment 3: Gender Classifier Implementation; 3.4.1 Linear Discriminant Analysis Gender Classifier; 3.4.2 Support Vector Machine Gender Classifier; 3.4.3 Neural Network Gender Classifier; 3.5 Experiment 4: Methods for Applying Gender Information When the Gender Classifier Performs Poorly; 3.5.1 Combining Gender Classifiers; 3.5.2 Method 3: Using the "Gender Score" as a New Feature; 3.5.3 Method 4: Checking the Gender Classification Accuracy of Enrollment Audio Files; 3.5.4 Method 5: Checking the Difference Between the Male and Female Probabilities of Test Audio Files
Chapter 4 Conclusion and Future Directions
References | |
| dc.language.iso | zh-TW | |
| dc.title | 使用性別資訊於語者驗證系統之研究與實作 | zh_TW |
| dc.title | A study and implementation on Speaker Verification System using Gender Information | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 106-2 | |
| dc.description.degree | Master's | |
| dc.contributor.oralexamcommittee | 李宏毅,廖元甫 | |
| dc.subject.keyword | Speaker verification, gender information, gender classifier, i-vector, probabilistic linear discriminant analysis | zh_TW |
| dc.subject.keyword | Speaker Verification, Gender Information, Gender Classifier, i-vector, PLDA | en |
| dc.relation.page | 75 | |
| dc.identifier.doi | 10.6342/NTU201801578 | |
| dc.rights.note | Not authorized | |
| dc.date.accepted | 2018-07-18 | |
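| dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW |
| dc.contributor.author-dept | Graduate Institute of Computer Science and Information Engineering | zh_TW |
| Appears in collections: | Department of Computer Science and Information Engineering | |
Files in this item:
| File | Size | Format | |
|---|---|---|---|
| ntu-107-1.pdf Not authorized for public access | 5.23 MB | Adobe PDF |
Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.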
