Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/70589
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 李宏毅 | |
dc.contributor.author | Pei-Hung Chung | en |
dc.contributor.author | 鍾佩宏 | zh_TW |
dc.date.accessioned | 2021-06-17T04:31:53Z | - |
dc.date.available | 2018-08-16 | |
dc.date.copyright | 2018-08-16 | |
dc.date.issued | 2018 | |
dc.date.submitted | 2018-08-13 | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/70589 | - |
dc.description.abstract | 本論文之主軸在探討語音數位內容之互動式檢索 (Interactive Retrieval of Spoken Content) 與針對互動式檢索系統中的模擬使用者做改進。由於數位語音內容難以快速瀏覽,且語音辨識的錯誤造成高度的不確定性,所以使用者與系統的互動對語音數位內容檢索系統 (Spoken Content Retrieval System) 有關鍵性的影響。在互動式檢索的系統中,系統會選擇不同的行動與使用者互動來得到更多資訊,所以如何讓系統根據目前的狀態選擇最有效率的行動是極為重要的。在前人的研究中,互動式檢索系統使用深度Q-類神經網路 (Deep Q-Network) 的演算法訓練馬可夫決策模型 (Markov Decision Process, MDP),並使用基於經驗法則訂定規則 (Rule-based) 的模擬使用者 (User Simulator)。然而,建立一個可信賴且貼近真實使用者行為的模擬使用者是很大的挑戰。本論文提出可與互動式檢索系統同步訓練的模擬使用者,來增進互動式語音數位內容檢索系統的效能,取代基於規則的模擬使用者。實驗顯示,可與檢索系統同步訓練的模擬使用者比起基於規則的模擬使用者不但得到更大獎勵,在真人評估 (Human Evaluation) 的測驗中也更像真實使用者。 | zh_TW |
dc.description.abstract | User-machine interaction is crucial for information retrieval, especially for spoken content retrieval, because spoken content is difficult to browse and speech recognition has a high degree of uncertainty. In interactive retrieval, the machine takes different actions to interact with the user and obtain better retrieval results, so it is critical to select the most efficient action. In previous work, deep Q-learning techniques were proposed to train an interactive retrieval system, but they relied on a hand-crafted user simulator, and building a reliable user simulator is difficult. In this thesis, we further improve the interactive spoken content retrieval framework by proposing a learnable user simulator that is jointly trained with the interactive retrieval system, making the hand-crafted user simulator unnecessary. Experimental results show that the learned simulated users not only achieve larger rewards than the hand-crafted ones but also behave more like real users. | en |
dc.description.provenance | Made available in DSpace on 2021-06-17T04:31:53Z (GMT). No. of bitstreams: 1 ntu-107-R05942048-1.pdf: 3982921 bytes, checksum: 029b068bdd6f92ee79deac2d9a2b0439 (MD5) Previous issue date: 2018 | en |
dc.description.tableofcontents | 口試委員會審定書
誌謝
中文摘要
英文摘要
一、導論
1.1 背景介紹
1.2 研究動機與目的
1.3 相關研究
1.4 主要貢獻
1.5 章節安排
二、背景知識
2.1 語音數位內容檢索
2.1.1 口述語彙偵測
2.1.2 語意檢索
2.1.3 語音數位內容檢索系統的基本架構
2.2 自動語音辨識系統
2.2.1 特徵抽取
2.2.2 聲學模型
2.2.3 語言模型
2.2.4 詞典
2.2.5 辭典外詞彙 (Out of Vocabulary, OOV)
2.2.6 詞圖與N最佳序列
2.3 資訊檢索系統
2.3.1 語音文件搜尋
2.3.2 語音文件索引
2.3.3 空間向量檢索模型 (VSM)
2.3.4 語言模型檢索模型 (LM)
2.3.5 資訊檢索評估機制
2.4 互動式系統
2.4.1 相關回饋
2.4.2 馬可夫決策模型
2.4.3 人工類神經網路
2.5 本章總結
三、互動式語音數位內容檢索系統
3.1 簡介
3.1.1 前人研究與改進動機
3.1.2 系統組成
3.2 檢索模塊
3.2.1 檢索架構
3.2.2 使用者回饋
3.3 搜尋結果的特徵抽取
3.3.1 人類知識特徵值
3.3.2 原始相關度分數
3.4 基於深度強化學習的對話管理者
3.4.1 行動 (Action)
3.4.2 獎勵 (Reward) 與回報 (Return)
3.5 強化學習演算法
3.5.1 深度強化學習演算法簡介
3.5.2 深度Q-類神經網路 (Deep Q-Network, DQN)
3.5.3 深度雙Q-類神經網路 (Double DQN)
3.5.4 深度競爭式Q-類神經網路 (Dueling DQN)
四、使用可同步學習模擬使用者之互動式語音數位內容檢索系統
4.1 簡介
4.1.1 前人研究與改進動機
4.1.2 系統組成
4.2 互動式語音數位內容檢索系統改進
4.3 基於深度強化學習的模擬使用者
4.3.1 狀態 (State)
4.3.2 行動 (Action)
4.3.3 獎勵 (Reward) 與回報 (Return)
4.3.4 訓練方法
4.4 本章總結
五、實驗與分析
5.1 簡介
5.2 互動式語音數位內容檢索實驗-中文廣播新聞語料庫
5.2.1 資料集簡介
5.2.2 實驗設定
5.2.3 實驗結果
5.2.4 真實使用者調查
5.2.5 使用者行為不一致的訓練與測試
5.2.6 系統行為分析
5.3 互動式語音數位內容檢索實驗-語音史丹佛問答資料集
5.3.1 資料集簡介
5.3.2 實驗設定
5.3.3 實驗結果
5.4 使用可同步學習模擬使用者之檢索系統
5.4.1 實驗設定
5.4.2 實驗結果
5.4.3 人工評估
5.5 本篇總結
六、結論與展望
6.1 結論與主要貢獻
6.2 未來展望
6.2.1 嘗試更適合互動式系統的深度強化學習模型
6.2.2 設計更有效率的同步更新策略
6.2.3 狀態與行動的拓展
6.2.4 設計自然語言使用者互動介面
參考文獻 | |
dc.language.iso | zh-TW | |
dc.title | 使用深度強化學習技術與可訓練模擬使用者之互動式語音數位內容檢索 | zh_TW |
dc.title | Interactive Spoken Content Retrieval with Deep Reinforcement Learning and Trainable User Simulator | en |
dc.type | Thesis | |
dc.date.schoolyear | 106-2 | |
dc.description.degree | 碩士 | |
dc.contributor.oralexamcommittee | 李琳山,王小川,鄭秋豫,陳信宏 | |
dc.subject.keyword | 深度強化學習,語音數位內容檢索,互動式資訊檢索 | zh_TW |
dc.subject.keyword | Deep Reinforcement Learning, Spoken Content Retrieval, Interactive Information Retrieval | en |
dc.relation.page | 113 | |
dc.identifier.doi | 10.6342/NTU201802898 | |
dc.rights.note | 有償授權 | |
dc.date.accepted | 2018-08-13 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 電信工程學研究所 | zh_TW |
Appears in Collections: | 電信工程學研究所 |
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-107-1.pdf (currently not authorized for public access) | 3.89 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.