Spoken Document Retrieval Based on Position Specific Posterior Lattices and Latent Semantic Analysis

Hung-Lin Chang; 張弘霖

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/37099

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	李琳山(Lin-Shan Lee)
dc.contributor.author	Hung-Lin Chang	en
dc.contributor.author	張弘霖	zh_TW
dc.date.accessioned	2021-06-13T15:19:11Z	-
dc.date.available	2008-07-25
dc.date.copyright	2008-07-25
dc.date.issued	2008
dc.date.submitted	2008-07-22
dc.identifier.citation	[1] http://www.youtube.com/ [2] Ian H. Witten, Eibe Frank, Data Mining, ch. 1, p. 2. Morgan Kaufmann, 2000 [3] Text REtrieval Conference, http://trec.nist.gov/ [4] John S. Garofolo, Cedric G. P. Auzanne, Ellen M. Voorhees, “The TREC Spoken Document Retrieval Track: A Success Story”, 2000 [5] Cross-Language Evaluation Forum, http://www.clef-campaign.org/ [6] NII Test Collection for Information Retrieval, http://research.nii.ac.jp/ntcir/ [7] ACM Special Interest Group on Information Retrieval, http://www.sigir.org/ [8] http://www.dcs.shef.ac.uk/spandh/projects/thisl/overview-oct98/ [9] http://mi.eng.cam.ac.uk/research/Projects/Video_Mail_Retrieval_Voice/ [10] http://mi.eng.cam.ac.uk/research/projects/Audio_Document_Processing/ [11] http://svr-www.eng.cam.ac.uk/research/projects/mdr/ [12] http://www.informedia.cs.cmu.edu/ [13] http://speechfind.utdallas.edu/ [14] http://web.sls.csail.mit.edu/lectures/ [15] Sean Colbath, Francis Kubala, “Rough'n'Ready: a meeting recorder and browser”, ACM Computing Surveys (CSUR), 1999 [16] John Choi, Don Hindle, Julia Hirschberg, Ivan Magrin-Chagnolleau, Christine Nakatani, Fernando Pereira, Amit Singhal, Steve Whittaker, “SCAN - Speech Content Based Audio Navigator: A Systems Overview”, ICSLP 1998 [17] Stanley Boykin, Andrew Merlino, “Improving Broadcast News Segmentation Processing”, Multimedia Computing Media, 1999 [18] Jean-Manuel Van Thong, Pedro J. Moreno, Beth Logan, Blair Fidler, Katrina Maffey, and Matthew Moores, “SPEECHBOT: An Experimental Speech-Based Search Engine for Multimedia Content in the Web”, 2001 [19] J. Garofolo, TREC-9 Spoken Document Retrieval Track, National Institute of Standards and Technology, http://trec.nist.gov/pubs/trec9/sdrt9_slides/sld001.htm [20] Takaaki Hori, I. Lee Hetherington, Timothy J. Hazen, and James R. Glass, “Open-Vocabulary Spoken Utterance Retrieval Using Confusion Networks”, ICASSP 2007, pp. 73-76 [21] Amit Singhal, “Modern Information Retrieval: A Brief Overview” [22] Ciprian Chelba, “Spoken Document Retrieval and Browsing”, 2007 [23] L. Mangu, E. Brill, and A. Stolcke, “Finding consensus in speech recognition: Word error minimization and other applications of confusion networks”, Computer Speech and Language, vol. 14, no. 4, pp. 373-400, Oct 2000 [24] Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval, chapter 2, pp. 27-30. Addison Wesley, New York, 1999 [25] Sergey Brin and Lawrence Page, “The anatomy of a large-scale hypertextual Web search engine”, Computer Networks and ISDN Systems, 1998 [26] Ciprian Chelba and Alex Acero, “Position Specific Posterior Lattices for Indexing Speech”, in ACL, Ann Arbor, 2005, pp. 443-450 [27] L. R. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition”, Proceedings of the IEEE, vol. 77, no. 2, pp. 257-285, 1989 [28] K. Ng, Subword-based Approaches for Spoken Document Retrieval, Ph.D. thesis, Massachusetts Institute of Technology, 2000 [29] F. Wessel, R. Schluter, K. Macherey, and H. Ney, “Confidence measures for large vocabulary continuous speech recognition”, IEEE Transactions on Speech and Audio Processing, vol. 9, no. 3, pp. 288-298, Mar 2001 [30] Yao Qian, Frank K. Soong, and Tan Lee, “Tone-enhanced generalized character posterior probability (GCPP) for Cantonese LVCSR”, ICASSP, 2006, pp. 133-136 [31] Y.-C. Pan, “One-pass and word-graph-based search algorithms for large vocabulary continuous Mandarin speech recognition”, M.S. thesis, National Taiwan University, 2001 [32] Thomas Hofmann, “Probabilistic Latent Semantic Indexing”, Proceedings of the Twenty-Second Annual International SIGIR Conference on Research and Development in Information Retrieval (SIGIR-99), 1999 [33] S. Deerwester, S. T. Dumais, G. W. Furnas, T. K. Landauer, and R. Harshman, “Indexing by latent semantic analysis”, Journal of the American Society for Information Science, 41, 1990 [34] Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval, chapter 5, p. 117. Addison Wesley, New York, 1999 [35] http://ckipsvr.iis.sinica.edu.tw/
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/37099	-
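dc.description.abstract	Multimedia documents that combine audio and visual content circulate widely over the Internet and are becoming an important source of new information for people in the digital age. The speech carried by these documents often plays the most critical role. Browsing documents that consist mainly of speech, however, is far less convenient than browsing text, so spoken document retrieval, which aims to find the information a user wants quickly and accurately among large numbers of spoken documents, has drawn growing attention. Compared with text retrieval, spoken document retrieval must also cope with the high recognition error rates that automatic speech recognition can produce because of factors such as environmental noise and speaking style, which makes retrieval harder. The first topic of this thesis is therefore how to build a good index for spoken documents that preserves as much of the information useful for retrieval as possible. Word-based position specific posterior lattices (PSPL) make full use of the position information of candidate words in the lattice, and earlier research has shown their advantages for spoken document retrieval. This thesis goes a step further and considers subword units, such as characters or syllables, as the basic unit for constructing position specific posterior lattices. Doing so handles the out-of-vocabulary (OOV) word problem that word-based PSPL cannot handle, and at the same time achieves better retrieval performance. In practical use of retrieval systems, many documents that are highly relevant to a query cannot be retrieved by a system based on literal matching, because they do not contain the query terms. The second topic of this thesis starts from this observation: it introduces the probabilistic latent semantic analysis (PLSA) model and uses the probability distributions of words over latent concepts to build an effective measure of term relatedness. This measure is then applied in a retrieval model based on latent semantics, which partially resolves the mismatch between query terms and target documents that literal matching cannot handle.	en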
dc.description.provenance	Made available in DSpace on 2021-06-13T15:19:11Z (GMT). No. of bitstreams: 1 ntu-97-R95922040-1.pdf: 1902984 bytes, checksum: 0e6599c3e051237c3e57ddc157bb0639 (MD5) Previous issue date: 2008	en
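dc.description.tableofcontents	Acknowledgements; Abstract; Table of Contents; List of Figures; List of Tables; Chapter 1 Introduction; 1.1 Motivation; 1.2 Related International Research; 1.3 Research Direction and Background of This Thesis; 1.4 Chapter Outline; Chapter 2 Background; 2.1 Basic Concepts of Document Retrieval; 2.2 Introduction to Spoken Document Indexing; 2.2.1 Importance of Document Indexing for Retrieval; 2.2.2 Differences Between Text and Speech Indexing; 2.2.3 Typical Speech Indexing Methods; 2.2.3.1 One-Best Sequence; 2.2.3.2 Lattice; 2.2.3.3 Confusion Network; 2.3 Evaluation of Information Retrieval; 2.3.1 Precision and Recall; 2.3.2 Text REtrieval Conference (TREC) Evaluation; 2.4 Chapter Summary; Chapter 3 Position Specific Posterior Lattices; 3.1 Importance of Position Information for Document Retrieval; 3.2 Word-Based Position Specific Posterior Lattices; 3.2.1 Soft Matching; 3.2.2 Position Specific Posterior Probability; 3.3 Subword-Based Position Specific Posterior Lattices; 3.3.1 The Out-of-Vocabulary Word Problem; 3.3.2 Subword Units; 3.3.3 Position Specific Posterior Probability with Subword Units; 3.4 Indexing and Retrieval of Spoken Documents Based on Position Specific Posterior Lattices; 3.4.1 Position Specific Posterior Lattices as the Index for Spoken Documents; 3.4.2 Spoken Document Retrieval Based on Position Specific Posterior Lattices; 3.5 Experiments on Position Specific Posterior Lattices for Spoken Document Retrieval; 3.5.1 Experimental Corpus and Test Environment; 3.5.2 Experimental Methods and Analysis of Results; 3.5.2.1 Retrieval with Both In-Vocabulary and Out-of-Vocabulary Queries; 3.5.2.2 Retrieval with In-Vocabulary Queries; 3.5.2.3 Retrieval with Out-of-Vocabulary Queries; 3.6 Chapter Summary; Chapter 4 Comparison of Confusion Networks and Position Specific Posterior Lattices as Spoken Document Indexes; 4.1 Review of Confusion Networks and Position Specific Posterior Lattices; 4.2 Analysis of Structural Differences Between Confusion Networks and Position Specific Posterior Lattices; 4.3 Experiments on Confusion Networks and Position Specific Posterior Lattices as Spoken Document Indexes; 4.3.1 Experimental Corpus and Test Environment; 4.3.2 Experimental Methods and Analysis of Results; 4.3.2.1 Coverage of the Speech Index; 4.3.2.2 Retrieval Performance on Spoken Documents; 4.3.2.3 Combined Comparison of Index Storage Size and Retrieval Performance; 4.4 Chapter Summary; Chapter 5 Related Term Extraction Based on the Probabilistic Latent Semantic Analysis Model; 5.1 Probabilistic Latent Semantic Analysis; 5.1.1 Concept of Conventional Latent Semantic Analysis; 5.1.2 Concept of Probabilistic Latent Semantic Analysis; 5.1.2.1 Latent Concept Model; 5.1.2.2 Estimating the Latent Concept Model with the Expectation-Maximization Algorithm; 5.2 Related Term Extraction; 5.2.1 Related Term Extraction Based on Global Analysis; 5.2.2 Related Term Extraction Based on the Latent Concept Model; 5.3 Related Term Extraction Experiments; 5.3.1 Experimental Corpus and Test Environment; 5.3.2 Baseline Related Term Extraction Experiments; 5.3.3 Keyword-Based Related Term Extraction Experiments; 5.4 Chapter Summary; Chapter 6 Spoken Document Retrieval Based on Latent Semantics; 6.1 Application of Term Relatedness Measures to Document Retrieval; 6.2 Extracting In-Vocabulary Words from Position Specific Posterior Lattices; 6.3 Architecture of the Latent-Semantics-Based Retrieval System; 6.4 Experiments on Latent-Semantics-Based Spoken Document Retrieval; 6.4.1 Experimental Environment and Test Corpus; 6.4.2 Experimental Methods and Analysis of Results; 6.4.2.1 Baseline Experiment; 6.4.2.2 Distance Measure Analysis; 6.4.2.3 Number of Latent Concepts Analysis; 6.4.2.4 Index Term Pruning Analysis; 6.4.2.5 Term Relatedness Threshold Analysis; 6.4.2.6 Term Relatedness Weight Analysis; 6.4.2.7 Keyword-Based Spoken Document Retrieval Experiment; 6.4.2.8 Comparison of Latent-Semantics-Based and Literal-Matching-Based Spoken Document Retrieval; 6.4.2.9 Presentation of Spoken Document Retrieval Results; 6.5 Chapter Summary; Chapter 7 Conclusions and Future Work; 7.1 Thesis Summary; 7.2 Future Work; References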
dc.language.iso	zh-TW
dc.subject	語音文件檢索	zh_TW
dc.subject	機率式潛藏語意分析模型	zh_TW
dc.subject	位置特定事後機率詞圖	zh_TW
dc.subject	Position Specific Posterior Lattices	en
dc.subject	Probabilistic Latent Semantic Analysis	en
dc.subject	Spoken Document Retrieval	en
dc.title	基於位置特定事後機率詞圖及潛藏語意分析之語音文件檢索	zh_TW
dc.title	Spoken Document Retrieval Based on Position Specific Posterior Lattices and Latent Semantic Analysis	en
dc.type	Thesis
dc.date.schoolyear	96-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	鄭秋豫,王小川,陳信宏,簡仁宗
dc.subject.keyword	語音文件檢索,機率式潛藏語意分析模型,位置特定事後機率詞圖	zh_TW
dc.subject.keyword	Spoken Document Retrieval, Probabilistic Latent Semantic Analysis, Position Specific Posterior Lattices	en
dc.relation.page	112
dc.rights.note	Paid license
dc.date.accepted	2008-07-24
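dc.contributor.author-college	College of Electrical Engineering and Computer Science	en
dc.contributor.author-dept	Graduate Institute of Computer Science and Information Engineering	en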
Appears in Collections:	Department of Computer Science and Information Engineering

Files in This Item:

File	Size	Format
ntu-97-1.pdf (not authorized for public access)	1.86 MB	Adobe PDF

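All items in the system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.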