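Integration of Feature-based and Model-based Approaches for Spoken Term Detection with Spoken Query (整合以特徵及模型之技術以語音查詢口語詞彙)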

Chun-Hsun Chen; 陳俊勳

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/31395

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	李琳山(Lin-Shan Lee)
dc.contributor.author	Chun-Hsun Chen	en
dc.contributor.author	陳俊勳	zh_TW
dc.date.accessioned	2021-06-13T02:47:37Z	-
dc.date.available	2011-08-08
dc.date.copyright	2011-08-08
dc.date.issued	2011
dc.date.submitted	2011-07-30
dc.identifier.citation	[1] http://zh.wikipedia.org/wiki/%E4%BF%A1%E6%81%AF%E4%B8%8D%E8%B6%B3. [2] http://news.netcraft.com/. [3] Ian H. Witten and Eibe Frank, “Data Mining,” 2000. [4] http://www.google.com/. [5] http://www.yahoo.com/. [6] Text REtrieval Conference, http://trec.nist.gov/. [7] http://trec.nist.gov/pubs/trec9/sdrt9_slides/. [8] Carolina Parada, Abhinav Sethy, and Bhuvana Ramabhadran, “Query-by-example spoken term detection for OOV terms,” in ASRU, 2009. [9] Timothy J. Hazen, Wade Shen, and Christopher White, “Query-by-example spoken term detection using phonetic posteriorgram templates,” in ASRU, 2009. [10] Yaodong Zhang and James R. Glass, “Unsupervised spoken keyword spotting via segmental DTW on Gaussian posteriorgrams,” in ASRU, 2009. [11] W. Shen, C. White, and T. Hazen, “A comparison of query-by-example methods for spoken term detection,” in INTERSPEECH, 2009. [12] Hui Lin, Alex Stupakov, and Jeff Bilmes, “Improving multi-lattice alignment based spoken keyword spotting,” in ICASSP, 2009. [13] Hui Lin, Alex Stupakov, and Jeff Bilmes, “Spoken keyword spotting via multi-lattice alignment,” in INTERSPEECH, 2008. [14] Ciprian Chelba and Alex Acero, “Position specific posterior lattices for indexing speech,” in ACL, 2005. [15] Yi-cheng Pan, Hung-lin Chang, and Lin-shan Lee, “Analytical comparison between position specific posterior lattices and confusion networks based on words and subword units for spoken document indexing,” in ASRU, 2007. [16] Yi-cheng Pan, “One-pass and word-graph-based search algorithms for large vocabulary continuous Mandarin speech recognition,” M.S. thesis, National Taiwan University, 2001. [17] Cambridge University Engineering Dept. (CUED), Machine Intelligence Laboratory, “HTK,” http://htk.eng.cam.ac.uk/. [18] Thomas Hofmann, “Probabilistic latent semantic indexing,” in SIGIR, 1999. [19] I-Hung Lin, “Acoustic characteristics based on probabilistic latent semantic analysis with applications in speech recognition,” M.S. thesis, National Taiwan University, 2010. [20] J. T. Foote, “Decision-tree probability modeling for HMM speech recognition,” Ph.D. dissertation, Brown University, Providence, RI, 1993. [21] Y. Linde, A. Buzo, and R. M. Gray, “An algorithm for vector quantizer design,” IEEE Transactions on Communications, vol. 28, pp. 84–95, 1980. [22] K. K. Paliwal and B. S. Atal, “Efficient vector quantization of LPC parameters at 24 bits/frame,” in IEEE International Conference on Acoustics, Speech and Signal Processing, 1991.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/31395	-
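dc.description.abstract	As handheld devices become increasingly common, speech retrieval is already widely used in applications such as looking up phone contacts or local businesses, and research on speech retrieval has therefore become an important topic. This thesis studies spoken term detection with spoken queries and explores several retrieval methods with the aim of making the system more complete. For model-based techniques, we use position specific posterior lattices (PSPL): both the spoken queries and the spoken documents are recognized into PSPLs, and we run experiments on different search matching methods, distance measures, score weights, and multi-query modules. Next, we study feature-based techniques: with no models and hence no recognition system, we compare only the features of the spoken queries with those of the spoken documents. Finally, we combine the two in a two-stage architecture. In the final results, the two-stage cascade architecture performs best. For single-module spoken queries, the best average retrieval precision is 60.81% for in-vocabulary spoken queries and 41.68% for out-of-vocabulary spoken queries. For spoken queries that combine five modules, the best average retrieval precision is 81.24% for in-vocabulary spoken queries and 61.85% for out-of-vocabulary spoken queries.	zh_TW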
dc.description.provenance	Made available in DSpace on 2021-06-13T02:47:37Z (GMT). No. of bitstreams: 1 ntu-100-R98922013-1.pdf: 2794791 bytes, checksum: 86328d8559742ee2c4cf9f5bf264572b (MD5) Previous issue date: 2011	en
dc.description.tableofcontents	Oral Examination Committee Certification, i; Chinese Abstract, ii; 1. Introduction, 1; 1.1 Research Background, 1; 1.2 Research Direction of This Thesis, 2; 1.3 Methods and Results of This Thesis, 3; 1.4 Thesis Organization, 4; 2. Background, 5; 2.1 Background on Information Retrieval, 5; 2.2 Background on Speech Retrieval, 7; 2.2.1 Overview, 7; 2.2.2 Speech Indexing, 8; 2.2.3 Core Themes of Speech Retrieval, 9; 2.2.4 Out-of-Vocabulary (OOV) Query Terms, 10; 2.2.5 Query by Example, 10; 2.3 Evaluation of Information Retrieval, 11; 3. Query-by-Example Spoken Search Based on Position Specific Posterior Lattices, 14; 3.1 Architecture of Lattice-Based Query-by-Example Spoken Search, 14; 3.2 Position Specific Posterior Lattices, 15; 3.2.1 Word-Based Position Specific Posterior Lattices, 16; 3.2.2 Subword-Based Position Specific Posterior Lattices, 16; 3.3 Analysis of Search Matching Methods, 18; 3.3.1 N-gram Matching, 19; 3.3.2 Dynamic Programming (DP), 22; 3.3.3 Revised Dynamic Programming 1 (RDP1), 24; 3.3.4 Revised Dynamic Programming 2 (RDP2), 25; 3.4 Analysis of Distance Measures, 26; 3.5 Lattice-Based Spoken Search Experiments, 29; 3.5.1 Corpus and Test Environment, 29; 3.5.2 Experimental Methods and Analysis of Results, 29; 3.5.3 Chapter Summary, 38; 4. Feature-Based Query-by-Example Spoken Search, 39; 4.1 Architecture of Feature-Based Spoken Search, 39; 4.2 Feature Units, 40; 4.2.1 Mel-Frequency Cepstral Coefficients, 40; 4.2.2 Clustering with Probabilistic Latent Semantic Analysis, 40; 4.3 Feature-Based Spoken Search Experiments, 45; 4.3.1 Corpus and Test Environment, 45; 4.3.2 Experimental Methods and Analysis of Results, 46; 4.3.3 Chapter Summary, 50; 5. Two-Stage Query-by-Example Spoken Search Combining Feature-Based and Position Specific Posterior Lattice-Based Approaches, 51; 5.1 Two-Stage Spoken Search, 51; 5.2 Parallel Architecture, 51; 5.3 Cascade Architecture, 54; 5.4 Experiments on Two-Stage Query-by-Example Spoken Search Combining Feature-Based and Lattice-Based Approaches, 57; 5.4.1 Corpus and Test Environment, 57; 5.4.2 Experimental Methods and Analysis of Results, 57; 5.4.3 Chapter Summary, 62; 6. Conclusions and Future Work, 63; 6.1 Conclusions, 63; 6.2 Future Work, 64; References, 65
dc.language.iso	zh-TW
dc.subject	口語詞彙	zh_TW
dc.subject	語音查詢	zh_TW
dc.subject	語音搜尋	zh_TW
dc.subject	spoken query	en
dc.subject	STD	en
dc.subject	SDR	en
dc.subject	QBE	en
dc.title	整合以特徵及模型之技術以語音查詢口語詞彙	zh_TW
dc.title	Integration of Feature-based and Model-based Approaches for Spoken Term Detection with Spoken Query	en
dc.type	Thesis
dc.date.schoolyear	99-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	徐宏民(Hung-Min Hsu),洪一平(Yi-Ping Hung)
dc.subject.keyword	語音搜尋,語音查詢,口語詞彙,	zh_TW
dc.subject.keyword	STD,SDR,QBE,spoken query,	en
dc.relation.page	67
dc.rights.note	Paid license (有償授權)
dc.date.accepted	2011-08-01
dc.contributor.author-college	電機資訊學院	zh_TW
dc.contributor.author-dept	資訊工程學研究所	zh_TW
Appears in Collections:	Department of Computer Science and Information Engineering (資訊工程學系)

Files in This Item:

File	Size	Format
ntu-100-1.pdf (not authorized for public access)	2.73 MB	Adobe PDF

Unless their copyright terms are specifically indicated, all items in this system are protected by copyright, with all rights reserved.