Multi-modal Dialogue User Interface for A Personal Photo Retrieval System

Hsiu-Wen Hsueh; 薛琇文

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/6722

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	李琳山
dc.contributor.author	Hsiu-Wen Hsueh	en
dc.contributor.author	薛琇文	zh_TW
dc.date.accessioned	2021-05-17T09:16:54Z	-
dc.date.available	2017-08-09
dc.date.available	2021-05-17T09:16:54Z	-
dc.date.copyright	2012-08-09
dc.date.issued	2012
dc.date.submitted	2012-07-30
dc.identifier.citation	[1] Teruhisa Misu, Tatsuya Kawahara, “Bayes risk-based dialogue management for document retrieval system with speech interface,” Speech Communication, vol. 52, pp. 61-71, 2010. [2] Antoine Raux, Brian Langner, Dan Bohus, Alan W Black, Maxine Eskenazi, “Let’s go public! Taking a spoken dialog system to the real world,” in Proceedings of Interspeech, 2005. [3] Jingjing Liu, Stephanie Seneff, Victor Zue, “Utilizing review summarization in a spoken recommendation system,” in Proceedings of the 11th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2010. [4] Thomas Hofmann, “Probabilistic latent semantic indexing,” in Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval, 1999. [5] Marc Cavazza, Raul Santos de la Camara, Markku Turunen, Jose Relano Gil, Jaakko Hakulinen, Nigel Crook, Debora Field, “‘How was your day?’ An affective companion ECA prototype,” in Proceedings of the 11th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2010. [6] Matthew Frampton, Sandeep Sripada, Ricardo Augusto Hoffmann Bion, Stanley Peters, “Detection of time-pressure induced stress in speech via acoustic indicators,” in Proceedings of the 11th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2010. [7] Kristy Elizabeth Boyer, Eun Young Ha, Robert Phillips, Michael D. Wallis, Mladen A. Vouk, James C. Lester, “Dialogue act modeling in a complex task-oriented domain,” in Proceedings of the 11th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2010. [8] Richard S. Sutton, Andrew G. Barto, Reinforcement Learning: An Introduction, MIT Press, 1998. [9] Matthew Frampton, Oliver Lemon, “Learning more effective dialogue strategies using limited dialogue move features,” in Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the ACL, 2006.
[10] Satinder Singh, Diane Litman, Michael Kearns, Marilyn Walker, “Optimizing dialogue management with reinforcement learning: experiments with the NJFun system,” Journal of Artificial Intelligence Research, vol. 16, pp. 105-133, 2002. [11] Jason D. Williams, Steve Young, “Partially observable Markov decision processes for spoken dialog systems,” Computer Speech and Language, vol. 21, pp. 393-422, 2007. [12] Jason D. Williams, “Incremental partition recombination for efficient tracking of multiple dialog states,” in Proceedings of International Conference on Acoustics, Speech and Signal Processing, 2010. [13] Stephane Ross, Brahim Chaib-draa, Joelle Pineau, Bayes-Adaptive POMDPs, MIT Press, 2008. [14] R. Lopez-Cozar, A. De la Torre, J.C. Segura, A.J. Rubio, “Assessment of dialogue systems by means of a new simulation technique,” Speech Communication, vol. 40, pp. 387-407, 2003. [15] Meritxell Gonzalez, Silvia Quarteroni, Giuseppe Riccardi, Sebastian Varges, “Cooperative user models in statistical dialog simulators,” in Proceedings of the 11th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2010. [16] Jost Schatzmann, Karl Weilhammer, Matt Stuttle, Steve Young, “A survey of statistical user simulation techniques for reinforcement-learning of dialogue management strategies,” The Knowledge Engineering Review, vol. 00:0, pp. 1-24, 2006. [17] Marilyn A. Walker, Diane J. Litman, Candace A. Kamm, Alicia Abella, “PARADISE: a framework for evaluating spoken dialogue agents,” in Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics, 1997. [18] Zhaojun Yang, Baichuan Li, Yi Zhu, Irwin King, Gina Levow, Helen Meng, “Collaborative filtering model for user satisfaction prediction in spoken dialog system evaluation,” in Proceedings of the 2010 IEEE Workshop on Spoken Language Technology, 2010.
[19] Daniel G. Bobrow, Ronald M. Kaplan, Martin Kay, Donald A. Norman, Henry Thompson, Terry Winograd, “GUS, a frame driven dialog system,” Artificial Intelligence, vol. 8, pp. 155-173, 1977. [20] Teruhisa Misu, Komei Sugiura, Kiyonori Ohtake, Chiori Hori, Hideki Kashioka, Hisashi Kawai, Satoshi Nakamura, “Modeling spoken decision making dialogue and optimization of its dialogue strategy,” in Proceedings of the 11th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2010. [21] Marilyn Walker, Donald Hindle, Jeanne Fromer, Giuseppe Di Fabbrizio, Craig Mestel, “Evaluating competing agent strategies for a voice email agent,” in Proceedings of the European Conference on Speech Communication and Technology, 1997. [22] Steve Young et al., The HTK Book (for HTK Version 3.0), Cambridge University, 2000. [23] Yoshinori Sagisaka, Nobuyoshi Kaiki, Naoto Iwahashi, Katsuhiko Mimura, “ATR μ-Talk speech synthesis system,” in Proceedings of ICSLP, 1992. [24] Heiga Zen, Takashi Nose, Junichi Yamagishi, Shinji Sako, Takashi Masuko, Alan W. Black, Keiichi Tokuda, “The HMM-based speech synthesis system (HTS) version 2.0,” in Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007. [25] Yi-Sheng Fu, Chia-Yu Wan, Lin-Shan Lee, “Latent semantic retrieval of personal photos with sparse user annotation by fused image/speech/text features,” in Proceedings of International Conference on Acoustics, Speech and Signal Processing, 2009. [26] Josef Sivic, Andrew Zisserman, “Video Google: a text retrieval approach to object matching in videos,” in Proceedings of the Ninth IEEE International Conference on Computer Vision, 2003. [27] Burr Settles, “Active learning literature survey,” Computer Sciences Technical Report 1648, 2009. [28] Vladimir N. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag New York, Inc., New York, NY, USA, 1995. [29] Martin T. Hagan, Howard B. Demuth, Mark H. Beale, Neural Network Design, University of Colorado Bookstore, 1996.
[30] Meng Wang, Xian-Sheng Hua, “Active learning in multimedia annotation and retrieval: a survey,” ACM Transactions on Intelligent Systems and Technology, vol. 2, no. 2, article 10, 2011. [31] Hector Geffner, Blai Bonet, “Solving large POMDPs using real time dynamic programming,” in Proceedings of AAAI Fall Symposium on POMDPs, 1998. [32] Lucie Daubigney, Milica Gašić, Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin, Steve Young, “Uncertainty management for on-line optimisation of a POMDP-based large-scale spoken dialogue system,” in Proceedings of Interspeech, 2011. [33] Yaakov Engel, Shie Mannor, Ron Meir, “Bayes meets Bellman: the Gaussian process approach to temporal difference learning,” in Proceedings of the Twentieth International Conference on Machine Learning, 2003.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/6722	-
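dc.description.abstract	This thesis applies the concepts and techniques of spoken dialogue systems to the personal photo retrieval system of the NTU Speech Lab. The goal is to use multi-modal dialogue between the system and the user to help users find the photos they want more conveniently and quickly. The thesis first builds a rule-based dialogue system whose rules are designed around scores commonly used in active learning. It then applies two machine learning approaches. The first is a Markov decision process (MDP) model trained with the real-time dynamic programming algorithm. The experiments compare system states built on three different numbers of topic classes, each paired with system actions defined over either topic classes or photos; when topic classes serve as system actions, the experiments also compare selecting photos at random with selecting them by active learning score. The second is a partially observable Markov decision process (POMDP) model, in which a Gaussian process is used to estimate the policy value function. These experiments likewise compare system states and system actions for three different numbers of topic classes, with photos selected by active learning score. The thesis further adds query expansion as a system action and applies it to both approaches, that is, to the MDP and POMDP models. Finally, we implement a mixed-initiative dialogue system and find that giving the user a suitable choice of actions beyond answering questions does allow the system to learn better.	zh_TW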
dc.description.provenance	Made available in DSpace on 2021-05-17T09:16:54Z (GMT). No. of bitstreams: 1 ntu-101-R99922051-1.pdf: 4516649 bytes, checksum: c88086a8b0460bfb04ade719c11d3c00 (MD5) Previous issue date: 2012	en
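dc.description.tableofcontents	Acknowledgements (p. i); Abstract (p. iii)
Chapter 1 Introduction (p. 1); 1.1 Research Motivation (p. 1); 1.2 Related Work (p. 2); 1.3 Research Directions and Results (p. 5); 1.4 Chapter Organization (p. 6)
Chapter 2 Background (p. 7); 2.1 Introduction to Dialogue Systems (p. 7); 2.1.1 Basic Assumptions of Dialogue Systems (p. 7); 2.1.2 Classification of Dialogue Systems (p. 7); 2.1.3 Architecture of Dialogue Systems (p. 10); 2.1.4 Summary (p. 17); 2.2 Introduction to the Personal Photo Retrieval System (p. 18); 2.3 Chapter Summary (p. 20)
Chapter 3 Experimental Datasets (p. 21); 3.1 Small Photo Dataset (p. 21); 3.2 Large Photo Dataset (p. 23); 3.3 Chapter Summary (p. 26)
Chapter 4 Basic Implementations of Different Dialogue System Models (p. 27); 4.1 Active Learning and the Rule-based Dialogue System (p. 27); 4.1.1 Introduction to Active Learning (p. 27); 4.1.2 System Rule Design (p. 28); 4.1.3 Experimental Setup (p. 31); 4.1.4 Analysis and Comparison of Experimental Data (p. 34); 4.1.5 Summary (p. 42); 4.2 Implementing the Dialogue System with a Markov Decision Process (p. 42); 4.2.1 Training the Markov Decision Process Model (p. 43); 4.2.2 Experimental Setup (p. 45); 4.2.3 Analysis and Comparison of Results across Experimental Settings (p. 47); 4.2.4 Summary (p. 56); 4.3 A Dialogue System Combining a Partially Observable Markov Decision Process with a Gaussian Process (p. 56); 4.3.1 Training the Partially Observable Markov Decision Process Model (p. 57); 4.3.2 Experimental Setup (p. 61); 4.3.3 Analysis and Comparison of Experimental Results (p. 62); 4.3.4 Summary (p. 66); 4.4 Chapter Summary (p. 66)
Chapter 5 Comparison of Different System Actions and User Actions in Dialogue Systems (p. 67); 5.1 A Dialogue System Supporting Both User Feedback and Query Expansion (p. 67); 5.1.1 Implementation with the Markov Decision Process Model (p. 69); 5.1.2 Implementation with the Partially Observable Markov Decision Process Model (p. 72); 5.1.3 Summary (p. 74); 5.2 Mixed-Initiative Dialogue Systems (p. 74); 5.2.1 Mixed-Initiative Markov Decision Process Model with User Relevance Feedback (p. 75); 5.2.2 Mixed-Initiative Markov Decision Process Model Combining Query Expansion with User Relevance Feedback (p. 78); 5.2.3 Summary (p. 80); 5.3 Chapter Summary (p. 80)
Chapter 6 Conclusions and Future Work (p. 81); 6.1 Summary (p. 81); 6.2 Future Work (p. 82)
References (p. 83)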
dc.language.iso	zh-TW
dc.title	個人相片檢索系統之多模式對話使用者介面	zh_TW
dc.title	Multi-modal Dialogue User Interface for A Personal Photo Retrieval System	en
dc.type	Thesis
dc.date.schoolyear	100-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	陳信宏,王小川,簡仁宗,鄭秋豫
dc.subject.keyword	對話,相片檢索,對話管理員,主動學習,加強學習,馬可夫決策程序模型,部份觀測馬可夫決策程序模型	zh_TW
dc.subject.keyword	dialog, photo retrieval, dialog manager, active learning, reinforcement learning, Markov decision process, partially observable Markov decision process	en
dc.relation.page	87
dc.rights.note	Authorization granted (open access worldwide)
dc.date.accepted	2012-07-31
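dc.contributor.author-college	College of Electrical Engineering and Computer Science	zh_TW
dc.contributor.author-dept	Graduate Institute of Computer Science and Information Engineering	zh_TW
Appears in Collections:	Department of Computer Science and Information Engineering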

Files in This Item:

File	Size	Format
ntu-101-1.pdf	4.41 MB	Adobe PDF	View/Open

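Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.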