Respiratory Sound Classification Based on Gaussian Mixture Models

Bor-Chin Liu; 劉柏青

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/32688

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	張璞曾
dc.contributor.author	Bor-Chin Liu	en
dc.contributor.author	劉柏青	zh_TW
dc.date.accessioned	2021-06-13T04:13:32Z	-
dc.date.available	2006-07-27
dc.date.copyright	2006-07-27
dc.date.issued	2006
dc.date.submitted	2006-07-25
dc.identifier.citation	[1] John F. Murray and Jay A. Nadel, Textbook of Respiratory Medicine, Philadelphia: W. B. Saunders, 1994. [2] Hans Pasterkamp et al., "Respiratory sounds: advances beyond the stethoscope," Am. J. Respir. Crit. Care Med., vol. 156, pp. 974-987, 1997. [3] F. Dalmay et al., "Acoustic properties of the normal chest," Eur. Respir. J., vol. 8, pp. 1761-1769, 1995. [4] D. A. Reynolds, "Speaker identification and verification using Gaussian mixture speaker models," Speech Communication, vol. 17, pp. 91-108, 1995. [5] M. Bahoura and C. Pelletier, "Respiratory sounds classification using cepstral analysis and Gaussian mixture models," 26th Annual International Conference of the IEEE EMBS, San Francisco, CA, USA, September 1-5, 2004. [6] Y. Shabtai-Musih, J. B. Grotberg, and N. Gavriely, "Spectral content of forced expiratory wheezes during air, He, and SF6 breathing in normal humans," J. Appl. Physiol., vol. 72, no. 2, pp. 629-635, Feb. 1992. [7] A. Homs-Corbera, R. Jané, J. A. Fiz, and J. Morera, "Algorithm for time-frequency detection and analysis of wheezes," Proceedings of the Annual International Conference of the IEEE, vol. 4, pp. 23-28, 2000. [8] R. J. Riella, P. Nohama, R. F. Borges, and A. L. Stelle, "Automatic wheezing recognition in recorded lung sounds," Proceedings of the Annual International Conference of the IEEE, vol. 3, pp. 17-21, 2003. [9] Styliani A. Taplidou and Leontios J. Hadjileontiadis, "On applying continuous wavelet transform in wheeze analysis," 26th Annual International Conference of the IEEE EMBS, San Francisco, CA, USA, September 1-5, 2004. [10] S. Furui, F. Itakura, and S. Saito, "Talker recognition by longtime averaged speech spectrum," Electron. Commun. in Japan, vol. 55-A, no. 10, pp. 54-61, 1972. [11] J. Markel, B. Oshika, and A. Gray, Jr., "Long-term feature averaging for speaker recognition," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-25, pp. 330-337, Aug. 1977. [12] L. Rudasi and S. A. Zahorian, "Text-independent talker identification with neural networks," in Proc. IEEE ICASSP, May 1991, pp. 389-392. [13] Y. Bennani and P. Gallinari, "On the use of TDNN-extracted features information in talker identification," in Proc. IEEE ICASSP, May 1991, pp. 385-388. [14] J. Oglesby and J. Mason, "Radial basis function networks for speaker recognition," in Proc. IEEE ICASSP, May 1985, pp. 387-390. [15] Jyh-Shing Roger Jang, Audio Signal Processing and Recognition (online textbook in Chinese), home page: http://neural.cs.nthu.edu.tw/jang/books/audioSignalProcessing [16] Douglas A. Reynolds and Richard C. Rose, "Robust text-independent speaker identification using Gaussian mixture speaker models," IEEE Transactions on Speech and Audio Processing, vol. 3, no. 1, pp. 72-83, Jan. 1995. [17] Alan V. Oppenheim, Alan S. Willsky, and S. Hamid Nawab, Signals and Systems, 2nd ed., Prentice Hall, 1997. [18] Arthur B. Williams and Fred J. Taylor, Electronic Filter Design Handbook, McGraw-Hill, 1995. [19] http://htk.eng.cam.ac.uk/ [20] Feng-Chia Chang, Huey-Dong Wu, and Fok-Ching Chong, "Wheeze detecting system based on 2D bilateral filtering of spectrogram image," June 2004.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/32688	-
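dc.description.abstract	The most common traditional approach to wheeze detection relies on the time-domain and frequency-domain characteristics of wheezes, such as their duration, the frequency range in which they occur, and, most importantly, their spectral peaks. However, the threshold used to decide whether a peak belongs to a wheeze is usually estimated by rule of thumb, so it is sensitive to environmental noise and depends on the recording system, which introduces errors. In addition, because the data are converted into a relative color-scale spectrogram, the overall computation time cannot meet real-time requirements, and manual reading of spectrograms is not suitable for the general public. These are the points that need improvement. The goal of this research is to identify whether a respiratory sound is normal or abnormal (for example, a wheeze), and to replace the empirically chosen peak threshold of traditional wheeze detection with a simple comparison of numerical values. To that end, we borrow concepts from speech technology and treat normal respiratory sounds and wheezes as two different speakers, so that classification becomes an objective, quantitative speaker identification task. In this thesis, Mel-frequency cepstral coefficients (MFCC) are extracted as the feature parameters of the respiratory sounds, and Gaussian mixture models (GMM) serve as the basis of the classifier. Using vector quantization (VQ), the expectation-maximization (EM) algorithm, and the maximum likelihood method, the models are trained and optimized on a training corpus to obtain the statistical distributions of the feature vectors of the two speakers, the normal respiratory sound speaker and the wheeze speaker, which are then used in speaker identification experiments. The experiments show that, provided the training corpus of respiratory sounds is sufficient, using more Gaussian mixture components better represents the acoustic characteristics of the normal and wheeze speakers, and the recognition rate rises accordingly. When the training corpus is insufficient, increasing the number of mixture components reaches a saturation point beyond which the recognition rate falls. We also found that longer test utterances give better recognition results. In the future, this classification method can serve as an objective, quantitative basis that helps physicians identify wheezing, gives them an objective reference, and offers patients and the general public a simpler and more convenient means of assessment.	zh_TW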
dc.description.abstract	Traditional wheeze detection methods search for the characteristic frequencies and durations of wheezes, or for peaks in successive spectra. In these methods, the discriminative threshold used to identify peaks is fixed empirically. The objective of this study is to classify normal and abnormal (wheezing) respiratory sounds using cepstral analysis (Mel-frequency cepstral coefficients, MFCC) combined with Gaussian mixture models (GMM). The GMM is a powerful statistical method widely used for speaker identification. Each class of respiratory sound is represented by a GMM obtained by training. During the test phase, an unknown sound is compared against all the GMMs, and the classification decision is based on the maximum likelihood (ML) criterion.	en
dc.description.provenance	Made available in DSpace on 2021-06-13T04:13:32Z (GMT). No. of bitstreams: 1 ntu-95-R93921120-1.pdf: 769156 bytes, checksum: 770c5ba40bf69a2a45beeb4b16f71ba7 (MD5) Previous issue date: 2006	en
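dc.description.tableofcontents	Table of Contents: Chapter 1 Introduction (p. 1); 1.1 Research Motivation (p. 1); 1.2 Introduction to Respiratory Sounds (p. 3); 1.2.1 Mechanisms of Lung Sound Generation (p. 3); 1.2.2 Overview of Lung Sound Frequency Characteristics (p. 4); 1.2.3 Classification of Respiratory Sounds (p. 5); 1.3 Overview of Speaker Identification (p. 7); 1.4 Topic and Direction of This Thesis (p. 9); 1.5 Chapter Outline (p. 10); Chapter 2 Principles and Methods (p. 11); 2.1 Literature Review (p. 11); 2.1.1 Wheeze Characteristics (p. 11); 2.1.2 Review of Wheeze Detection Literature (p. 12); 2.2 Development of Speaker Identification Technology (p. 13); 2.3 Basic Techniques of Speaker Identification (p. 15); 2.3.1 Mel-Frequency Cepstral Coefficients (Mel-Scale Cepstrum) (p. 15); 2.3.2 Vector Quantization (VQ) (p. 19); 2.3.3 Expectation-Maximization Algorithm (p. 21); 2.4 Gaussian Mixture Model (GMM) (p. 23); 2.4.1 Model Description (p. 23); 2.4.2 Model Training (p. 24); 2.4.3 Decision Rule (p. 26); Chapter 3 Respiratory Sound Measurement System (p. 28); 3.1 System Architecture (p. 28); 3.2 Hardware (p. 30); 3.3 Software (p. 36); 3.3.1 Architecture of the Respiratory Sound Classification Method (p. 37); Chapter 4 Experimental Results (p. 38); 4.1 Experimental Setup (p. 38); 4.1.1 Speech Database (Corpus) (p. 38); 4.1.2 Feature Extraction from the Speech Signal (p. 38); 4.2 Performance Evaluation of the Speaker Identification System (p. 40); 4.3 Speaker Identification Experiments Using Gaussian Mixture Models (p. 41); 4.3.1 Experiment 1: Effect of the Number of Gaussian Components on Recognition (p. 41); 4.3.2 Experiment 2: Effect of Training Data Length (p. 42); Chapter 5 Discussion and Conclusions (p. 45); Chapter 6 Future Work (p. 48); References (p. 49)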
dc.language.iso	zh-TW
dc.title	Respiratory Sound Classification Based on Gaussian Mixture Models	zh_TW
dc.title	Respiratory Sounds Classification Base On Gaussian Mixture Models	en
dc.type	Thesis
dc.date.schoolyear	94-2
dc.description.degree	Master's
dc.contributor.coadvisor	吳惠東
dc.contributor.oralexamcommittee	余松年,詹曉龍,林志隆
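dc.subject.keyword	Mel-frequency cepstral coefficients, Gaussian mixture models, vector quantization, expectation-maximization algorithm, maximum likelihood method	zh_TW
dc.subject.keyword	wheezes, respiratory sounds, cepstral analysis, Mel Frequency Cepstral Coefficients, Gaussian Mixture Models, Maximum Likelihood	en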
dc.relation.page	52
dc.rights.note	Paid license
dc.date.accepted	2006-07-25
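dc.contributor.author-college	College of Electrical Engineering and Computer Science	zh_TW
dc.contributor.author-dept	Graduate Institute of Electrical Engineering	zh_TW
Appears in Collections:	Department of Electrical Engineering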
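
The pipeline summarized in the abstract (per-class GMMs trained with EM, then a maximum likelihood decision over the frames of an unknown sound) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the 2-D synthetic points stand in for MFCC frames, the initial means are simply the first frames of the training data rather than a VQ codebook, and all function names, the variance floor, and the iteration count are hypothetical choices.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def component_terms(x, gmm):
    """log(weight) + log density of x under each (weight, mean, var) component."""
    return [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]


def sequence_log_likelihood(frames, gmm):
    """Sum over frames of log p(x | gmm), computed with log-sum-exp."""
    total = 0.0
    for x in frames:
        terms = component_terms(x, gmm)
        top = max(terms)
        total += top + math.log(sum(math.exp(t - top) for t in terms))
    return total


def em_step(frames, gmm, var_floor=1e-3):
    """One EM iteration: responsibilities (E-step), then re-estimation (M-step)."""
    dim = len(frames[0])
    resp = []
    for x in frames:
        terms = component_terms(x, gmm)
        top = max(terms)
        probs = [math.exp(t - top) for t in terms]
        norm = sum(probs)
        resp.append([p / norm for p in probs])
    new_gmm = []
    for k, old in enumerate(gmm):
        nk = sum(r[k] for r in resp)
        if nk < 1e-12:  # component received no data: keep it unchanged
            new_gmm.append(old)
            continue
        mean = [sum(r[k] * x[d] for r, x in zip(resp, frames)) / nk
                for d in range(dim)]
        var = [max(sum(r[k] * (x[d] - mean[d]) ** 2
                       for r, x in zip(resp, frames)) / nk, var_floor)
               for d in range(dim)]
        new_gmm.append((nk / len(frames), mean, var))
    return new_gmm


def train_gmm(frames, n_components, iterations=20):
    """Fit a GMM with EM. Assumption: means start at the first frames, unit variance."""
    dim = len(frames[0])
    gmm = [(1.0 / n_components, list(frames[k]), [1.0] * dim)
           for k in range(n_components)]
    for _ in range(iterations):
        gmm = em_step(frames, gmm)
    return gmm


def classify(frames, models):
    """Maximum likelihood decision: the label whose GMM scores the frames highest."""
    return max(models, key=lambda label: sequence_log_likelihood(frames, models[label]))


def synth_frames(centers, n, rng):
    """Hypothetical stand-in for MFCC frames: unit-variance points around the centers."""
    return [[rng.gauss(c, 1.0) for c in centers[i % len(centers)]] for i in range(n)]


rng = random.Random(0)
normal_centers = [(0.0, 0.0), (2.0, 3.0)]
wheeze_centers = [(7.0, 1.0), (9.0, 4.0)]
models = {
    "normal": train_gmm(synth_frames(normal_centers, 200, rng), n_components=2),
    "wheeze": train_gmm(synth_frames(wheeze_centers, 200, rng), n_components=2),
}
test_normal = synth_frames(normal_centers, 30, rng)
test_wheeze = synth_frames(wheeze_centers, 30, rng)
print(classify(test_normal, models), classify(test_wheeze, models))
```

Because the score is a sum of per-frame log-likelihoods, a longer test segment accumulates more evidence, which is consistent with the abstract's observation that longer test utterances improve recognition.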

Files in This Item:

File	Size	Format
ntu-95-1.pdf (not currently authorized for public access)	751.13 kB	Adobe PDF

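Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.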