Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/19994
Full metadata record
DC Field (Language): Value
dc.contributor.advisor: 張智星
dc.contributor.author: Yu-Jui Suen
dc.contributor.author (zh_TW): 蘇俞睿
dc.date.accessioned: 2021-06-08T02:38:32Z
dc.date.copyright: 2018-07-19
dc.date.issued: 2018
dc.date.submitted: 2018-07-17
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/19994
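dc.description.abstract (zh_TW, translated): In speaker verification, a common way to improve accuracy without changing the architecture of the acoustic model is to replace the gender-independent model with gender-dependent models trained separately on male and female speech. In practice, however, the gender of a test speaker is unknown, so the gender classifier plays a critical role in this pipeline and its accuracy directly affects the performance of the speaker verification system. Ensuring that the system correctly rejects impostors of either gender is also an important requirement of this approach.
To examine how different ways of using speaker gender information affect a speaker verification system, this thesis implements a speaker verification system that uses i-vectors as speaker features and a probabilistic linear discriminant analysis (PLDA) model as the scorer, together with three i-vector-based gender classifiers. After analyzing the weaknesses of a conventional speaker verification system that uses gender-dependent models, the thesis proposes several alternative ways of applying gender information under two conditions, when the gender classifier performs well and when it performs poorly. It analyzes how each method performs under different impostor gender compositions and finally reaches the goal of outperforming the conventional approach in all of these conditions.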
dc.description.abstract (en): In speaker verification, one way to improve accuracy without changing the acoustic model's algorithm is to use gender-dependent models instead of a gender-independent one. However, because the gender of a test speaker is not known in advance, the gender classifier plays an important role: its accuracy directly affects the performance of the whole speaker verification system. Ensuring that the system maintains good performance under different gender compositions of test speakers is a further important requirement.
To explore how different uses of gender information affect a speaker verification system, this thesis implements a speaker verification system that uses i-vectors as speaker features and a PLDA model for scoring, together with three i-vector-based gender classifiers. After analyzing the weaknesses of a speaker verification system that uses gender-dependent models in the conventional way, we propose several methods for applying gender information, both when the gender classifier performs well and when it performs poorly, and we analyze the performance of each method under different gender compositions of test speakers. Finally, we reach the goal of making our system outperform the traditional practice under these different circumstances.
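The idea of gating a verification decision on predicted gender can be illustrated with a minimal sketch. This is not the thesis's implementation: cosine similarity stands in for PLDA scoring, and the function names, the `threshold` and `min_margin` parameters, and the probability dictionary format are all hypothetical choices made only for illustration.

```python
import math


def cosine_score(a, b):
    """Cosine similarity between two vectors (a stand-in for PLDA scoring)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(enroll_vec, test_vec, enroll_gender, test_gender_probs,
           threshold=0.5, min_margin=0.2):
    """Accept or reject a trial, rejecting early on a confident gender mismatch.

    test_gender_probs is a hypothetical classifier output such as
    {"male": 0.9, "female": 0.1}. The gender gate is applied only when
    the gap between the two probabilities is at least min_margin, so an
    unsure classifier does not veto the trial on its own.
    """
    predicted = max(test_gender_probs, key=test_gender_probs.get)
    margin = abs(test_gender_probs["male"] - test_gender_probs["female"])
    if margin >= min_margin and predicted != enroll_gender:
        return False  # confident mismatch: reject without scoring
    return cosine_score(enroll_vec, test_vec) >= threshold
```

With these assumed defaults, a trial whose test utterance is confidently classified as the opposite gender is rejected regardless of its similarity score, while a low-confidence gender prediction falls through to ordinary scoring.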
dc.description.provenance (en): Made available in DSpace on 2021-06-08T02:38:32Z (GMT). No. of bitstreams: 1
ntu-107-R05922026-1.pdf: 5357033 bytes, checksum: c1f6b1e8112bc94fa79f245aeb89097d (MD5)
Previous issue date: 2018
dc.description.tableofcontents:
Oral Examination Committee Certification
Acknowledgements
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures and Tables
Chapter 1 Introduction
1.1 Introduction to Speaker Recognition
1.1.1 Speaker Identification and Speaker Verification
1.1.2 Standard Pipeline of Speaker Recognition
1.1.3 Evaluation Metrics for Speaker Recognition
1.2 Research Overview
1.3 Chapter Outline
Chapter 2 Speech Processing and Speaker Recognition Techniques
2.1 Speech Signal Features: MFCC
2.2 Gaussian Mixture Models
2.2.1 Overview of Gaussian Mixture Models
2.2.2 Universal Background Model and Its Training Procedure
2.2.3 Speaker-Dependent Models and Their Training Procedure
2.3 Factor Analysis and i-vectors
2.3.1 The i-vector Concept and Its Evolution
2.3.2 Training Procedure of the i-vector Model
2.4 Dimensionality Reduction and Channel Compensation for i-vectors: Linear Discriminant Analysis
2.5 Speaker Scoring Model: Probabilistic Linear Discriminant Analysis
2.5.1 PLDA Model Training Procedure
2.5.2 PLDA Scoring Procedure
Chapter 3 Experimental Results
3.1 Experimental Corpora and Data Configuration
3.1.1 NIST SRE 2010 Corpus and Data Configuration
3.1.2 TIMIT Corpus and Data Configuration
3.2 Experiment 1: Baseline
3.3 Experiment 2: Using Gender Information When the Gender Classifier Performs Well
3.3.1 Prior Approach (Common Practice)
3.3.2 Method 1: Rejecting Test Speakers Whose Gender Does Not Match the Enrolled Speaker
3.3.3 Method 2: Using Gender as an Additional Feature Dimension
3.4 Experiment 3: Gender Classifier Implementation
3.4.1 Linear Discriminant Analysis Gender Classifier
3.4.2 Support Vector Machine Gender Classifier
3.4.3 Neural Network Gender Classifier
3.5 Experiment 4: Using Gender Information When the Gender Classifier Performs Poorly
3.5.1 Synthesizing Gender Classifiers
3.5.2 Method 3: Using the "Gender Score" as a New Feature
3.5.3 Method 4: Checking the Gender Classification Accuracy of Enrollment Utterances
3.5.4 Method 5: Checking the Male/Female Probability Difference of Test Utterances
Chapter 4 Conclusions and Future Directions
References
dc.language.iso: zh-TW
dc.title (zh_TW): 使用性別資訊於語者驗證系統之研究與實作
dc.title (en): A study and implementation on Speaker Verification System using Gender Information
dc.type: Thesis
dc.date.schoolyear: 106-2
dc.description.degree: Master's (碩士)
dc.contributor.oralexamcommittee: 李宏毅, 廖元甫
dc.subject.keyword (zh_TW): 語者驗證, 性別資訊, 性別分類器, i-向量, 機率性線性判別分析
dc.subject.keyword (en): Speaker Verification, Gender Information, Gender Classifier, i-vector, PLDA
dc.relation.page: 75
dc.identifier.doi: 10.6342/NTU201801578
dc.rights.note: Not authorized (未授權)
dc.date.accepted: 2018-07-18
dc.contributor.author-college (zh_TW): College of Electrical Engineering and Computer Science (電機資訊學院)
dc.contributor.author-dept (zh_TW): Graduate Institute of Computer Science and Information Engineering (資訊工程學研究所)
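Appears in Collections: Department of Computer Science and Information Engineering

Files in This Item:
ntu-107-1.pdf (not authorized for public access), 5.23 MB, Adobe PDF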