Integrating Indexing Information by Machine Learning for Chinese Spoken Document Retrieval (用機器學習整合索引資訊之中文語音文件檢索)

Chia-Ming Yang; 楊家銘

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/43698

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	李琳山(Lin-Shan Lee)
dc.contributor.author	Chia-Ming Yang	en
dc.contributor.author	楊家銘	zh_TW
dc.date.accessioned	2021-06-15T02:26:16Z	-
dc.date.available	2011-08-18
dc.date.copyright	2011-08-18
dc.date.issued	2011
dc.date.submitted	2011-08-16
dc.identifier.citation	[1] Vannevar Bush, “As we may think,” The Atlantic Monthly, 1945.
[2] K. Sparck Jones, G. J. F. Jones, J. T. Foote, and S. J. Young, “Experiments in spoken document retrieval,” Information Processing & Management, vol. 32, no. 4, pp. 399–417, July 1996.
[3] J. Garofolo, G. Auzanne, and E. Voorhees, “The TREC Spoken Document Retrieval Track: A Success Story,” 2000.
[4] National Information Standards Organization (NISO), Understanding Metadata, NISO Press, 2004.
[5] Berlin Chen, Jen-Wei Kuo, Yao-Min Huang, and Hsin-Min Wang, “Statistical Chinese spoken document retrieval using latent topical information,” Interspeech, 2004.
[6] Amit Singhal, “Modern information retrieval: A brief overview,” IEEE, 2004.
[7] Cornelis Joost van Rijsbergen, Information Retrieval, 2nd edition, Butterworths, 1979.
[8] Gerard Salton, Automatic Information Organization and Retrieval, McGraw Hill Text, 1968.
[9] J. Xu, Y. Cao, H. Li, and Y. Huang, “Cost-sensitive learning of SVM for ranking,” Proc. ECML, 2006.
[10] S. E. Robertson and K. Sparck Jones, “Relevance weighting of search terms,” Journal of the American Society for Information Science, pp. 129–146, 1976.
[11] R. Nallapati, “Discriminative models for information retrieval,” SIGIR ’04: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, 2004.
[12] H. Drucker, D. Wu, and V. N. Vapnik, “Support vector machines for spam categorization,” IEEE Transactions on Neural Networks, 1999.
[13] Adam Berger, “Statistical machine learning for information retrieval,” Ph.D. thesis, CMU, 2001.
[14] R. Herbrich, T. Graepel, and K. Obermayer, “Large margin rank boundaries for ordinal regression,” Advances in Large Margin Classifiers, 2000.
[15] Luc De Raedt and Stefan Wrobel, Eds., Learning to rank using gradient descent, ACM, 2005.
[16] Yunbo Cao, Jun Xu, Tie-Yan Liu, Hang Li, Yalou Huang, and Hsiao-Wuen Hon, “Adapting ranking SVM to document retrieval,” in Proceedings of the 29th annual ACM SIGIR conference, Seattle, Washington, USA, 2006, pp. 186–193, ACM.
[17] Christopher J. C. Burges, Robert Ragno, and Quoc V. Le, “Learning to Rank with Nonsmooth Cost Functions,” in NIPS, Bernhard Schölkopf, John C. Platt, and Thomas Hoffman, Eds., 2006, pp. 193–200, MIT Press.
[18] Jun Xu and Hang Li, “AdaRank: a boosting algorithm for information retrieval,” in Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, New York, NY, USA, 2007, SIGIR ’07, pp. 391–398, ACM.
[19] Yisong Yue, Thomas Finley, Filip Radlinski, and Thorsten Joachims, “A support vector method for optimizing average precision,” in Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, New York, NY, USA, 2007, SIGIR ’07, pp. 271–278, ACM.
[20] Zhe Cao, Tao Qin, Tie-Yan Liu, Ming-Feng Tsai, and Hang Li, “Learning to rank: from pairwise approach to listwise approach,” in ICML ’07: Proceedings of the 24th international conference on Machine learning, New York, NY, USA, 2007, pp. 129–136, ACM.
[21] Ciprian Chelba, “Spoken document retrieval and browsing,” Hopkins CLSP, 2007.
[22] L. Mangu, “Finding consensus in speech recognition: word error minimization and other applications of confusion networks,” Computer Speech & Language, vol. 14, no. 4, pp. 373–400, Oct. 2000.
[23] Ciprian Chelba and Alex Acero, “Position specific posterior lattices for indexing speech,” in Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, Stroudsburg, PA, USA, 2005, ACL ’05, pp. 443–450, Association for Computational Linguistics.
[24] L. R. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition,” Proceedings of the IEEE, vol. 77, no. 2, pp. 257–286, Feb. 1989.
[25] Cambridge University Engineering Department (CUED), “HTK,” http://htk.eng.cam.ac.uk/.
[26] SRI International, “SRILM,” http://www.speech.sri.com/projects/srilm/.
[27] Yi-Cheng Pan, “One-pass and word-graph-based search algorithms for large vocabulary continuous Mandarin speech recognition,” Master's thesis, National Taiwan University, 2002.
[28] Hung-ling Chang, “Spoken document retrieval based on position specific posterior lattices and latent semantic analysis,” Master's thesis, National Taiwan University, 2008.
[29] Chao-hong Mong, Hung-yi Lee, and Lin-shan Lee, “Improved lattice-based spoken document retrieval by directly learning from the evaluation measure,” ICASSP, 2009.
[30] Jun Xu, Tie Y. Liu, Min Lu, Hang Li, and Wei Y. Ma, “Directly optimizing evaluation measures in learning to rank,” in SIGIR ’08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, New York, NY, USA, 2008, pp. 107–114, ACM.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/43698	-
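dc.description.abstract	Spoken document retrieval is increasingly important in the multimedia era of information explosion. Most spoken document retrieval techniques involve two major steps: automatic speech recognition, followed by retrieval using the indexing information produced from the recognition output. The first step faces potentially high recognition error rates, which affect the correctness of the information carried by the resulting spoken document indexes; the second step concerns how to exploit the information in these indexes as fully as possible. This thesis addresses the second step: how to integrate, through learning to rank, the indexing information produced from different linguistic units of Chinese speech (for example, words, characters, syllables, and Initial-Finals). Two learning-to-rank methods are studied: AdaRank and the Support Vector Machine for Optimizing Mean Average Precision (SVM-map). Experimental results show that SVM-map performs better: its best mean average precision improves on that of AdaRank by 4.70%. Compared with the index known to have the best individual retrieval performance (Oracle), it improves by 8.67% over all queries combined; within these, in-vocabulary queries improve by 6.30%, and out-of-vocabulary queries benefit most, with a direct improvement of about 11.63%. These results also confirm that combining spoken document indexes built from different linguistic units, using appropriate weights found through learning to rank, further improves both the performance and the robustness of spoken document retrieval.	zh_TW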
dc.description.provenance	Made available in DSpace on 2021-06-15T02:26:16Z (GMT). No. of bitstreams: 1 ntu-100-R98922038-1.pdf: 2422474 bytes, checksum: c915f2b6b1ce8e0012abd9e408bfafbe (MD5) Previous issue date: 2011	en
dc.description.tableofcontents	Oral Examination Committee Certification
Chinese Abstract
1. Introduction
1.1 Research Motivation
1.2 Research Direction and Related Work
1.3 Main Methods and Results
1.4 Thesis Organization
2. Background
2.1 Introduction to Information Retrieval
2.1.1 Matching Strategy
2.1.2 Learning Capability
2.2 Introduction to Spoken Document Retrieval
2.2.1 One-Best Sequence
2.2.2 Word Graph / Lattice
2.2.3 Confusion Network
2.2.4 Position Specific Posterior Lattices
2.3 Retrieval Evaluation
2.4 Chapter Summary
3. Spoken Document Indexing
3.1 Experimental Setup
3.1.1 Tools
3.1.2 Lexicon, Acoustic Model, and Language Model
3.1.3 Test Corpus
3.1.4 Queries
3.1.5 Retrieval Model and Evaluation Criteria
3.2 Preliminary Experiments
3.2.1 Spoken Document Indexes
3.2.2 Combinations of Spoken Document Indexes
3.2.3 Tools for Generating Indexes
3.3 Experimental Results
3.3.1 Retrieval Performance of Individual Spoken Document Indexes
3.3.2 Performance of Uniform-Weight Combination
3.4 Chapter Summary
4. Improvement with Machine Learning Methods
4.1 Experimental Setup
4.2 General Framework
4.3 AdaRank
4.3.1 Introduction
4.3.2 Algorithm
4.3.3 Experimental Results
4.3.4 Weight Analysis
4.4 Support Vector Machine for Optimizing Mean Average Precision (SVM-map)
4.4.1 Introduction
4.4.2 Development Set for Selecting the Best Parameters
4.4.3 Experimental Results
4.4.4 Comparison of Combinations of Different Index Types
4.4.5 Weight Analysis
4.5 Chapter Summary
5. Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References
dc.language.iso	zh-TW
dc.subject	語音文件索引	zh_TW
dc.subject	語音	zh_TW
dc.subject	語音文件	zh_TW
dc.subject	語音文件檢索	zh_TW
dc.subject	機器學習	zh_TW
dc.subject	Spoken Document Retrieval Indexing	en
dc.subject	Speech	en
dc.subject	Spoken Document Retrieval	en
dc.subject	Machine Learning	en
dc.title	用機器學習整合索引資訊之中文語音文件檢索	zh_TW
dc.title	Integrating Indexing Information by Machine Learning for Chinese Spoken Document Retrieval	en
dc.type	Thesis
dc.date.schoolyear	99-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	陳信希(Hsin-Hsi Chen),洪一平(Yi-Ping Hung)
dc.subject.keyword	語音, 語音文件, 語音文件檢索, 機器學習, 語音文件索引	zh_TW
dc.subject.keyword	Speech, Spoken Document Retrieval, Machine Learning, Spoken Document Retrieval Indexing	en
dc.relation.page	66
dc.rights.note	Paid authorization
dc.date.accepted	2011-08-17
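dc.contributor.author-college	College of Electrical Engineering and Computer Science	zh_TW
dc.contributor.author-dept	Graduate Institute of Computer Science and Information Engineering	zh_TW
Appears in collections:	Department of Computer Science and Information Engineering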

Files in this item:

File	Size	Format
ntu-100-1.pdf (not authorized for public access)	2.37 MB	Adobe PDF

Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise specified.