Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/66998
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 傅立成 | |
dc.contributor.author | Yi-Luen Wu | en |
dc.contributor.author | 吳宜倫 | zh_TW |
dc.date.accessioned | 2021-06-17T01:16:49Z | - |
dc.date.available | 2020-08-25 | |
dc.date.copyright | 2017-08-25 | |
dc.date.issued | 2017 | |
dc.date.submitted | 2017-08-14 | |
dc.identifier.citation | [1] E. Fusco, Effective questioning strategies in the classroom: A step-by-step approach to engaged thinking and learning, K-8. Teachers College Press, 2012.
[2] M. D. Zeiler and R. Fergus, “Visualizing and understanding convolutional networks,” in European Conference on Computer Vision, 2014, pp. 818–833.
[3] F. B. Bryant, C. M. Smart, and S. P. King, “Using the past to enhance the present: Boosting happiness through positive reminiscence,” Journal of Happiness Studies, vol. 6, no. 3, pp. 227–260, 2005.
[4] R. Fivush and K. Nelson, “Culture and language in the emergence of autobiographical memory,” Psychological Science, vol. 15, no. 9, pp. 573–577, 2004.
[5] R. Fivush, “The development of autobiographical memory,” Annual Review of Psychology, vol. 62, pp. 559–582, 2011.
[6] R. C. Atkinson and R. M. Shiffrin, “Human memory: A proposed system and its control processes,” Psychology of Learning and Motivation, vol. 2, pp. 89–195, 1968.
[7] F. C. Bartlett and C. Burt, “Remembering: A study in experimental and social psychology,” British Journal of Educational Psychology, vol. 3, no. 2, pp. 187–192, 1933.
[8] R. Schank and R. Abelson, Scripts, plans, goals, and understanding: An inquiry into human knowledge structures, 1977.
[9] J. M. Mandler and N. S. Johnson, “Remembrance of things parsed: Story structure and recall,” Cognitive Psychology, vol. 9, no. 1, pp. 111–151, 1977.
[10] P. W. Thorndyke, “Cognitive structures in comprehension and memory of narrative discourse,” Cognitive Psychology, vol. 9, no. 1, pp. 77–110, 1977.
[11] B. J. Reiser, J. B. Black, and R. P. Abelson, “Knowledge structures in the organization and retrieval of autobiographical memories,” Cognitive Psychology, vol. 17, pp. 89–137, 1985.
[12] J. L. Kolodner, “Reconstructive memory: A computer model,” Cognitive Science, vol. 7, no. 4, pp. 281–328, 1983.
[13] N. Kuwahara, S. Abe, K. Yasuda, and K. Kuwabara, “Networked reminiscence therapy for individuals with dementia by using photo and video sharing,” in Proceedings of the 8th International ACM SIGACCESS Conference on Computers and Accessibility, 2006, pp. 125–132.
[14] N. Alm, R. Dye, G. Gowans, J. Campbell, A. Astell, and M. Ellis, “A communication support system for older people with dementia,” IEEE Computer, vol. 40, no. 5, pp. 35–41, 2007.
[15] S. McCarthy, H. Sayers, P. McKevitt, and M. McTear, “MemoryLane: Reminiscence for older adults,” in CEUR Workshop Proceedings, vol. 499, 2009, pp. 22–27.
[16] S. T. Peesapati, V. Schwanda, J. Schultz, M. Lepage, S.-Y. Jeong, and D. Cosley, “Pensieve: Supporting everyday reminiscence,” in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 2010, pp. 2027–2036.
[17] T. Okada, M. Nihei, T. Narita, and M. Kamata, “Conversational system encouraging communication of the aged by method of reminiscence and quantification of active participation,” in International Conference on Universal Access in Human-Computer Interaction, 2013, pp. 191–200.
[18] J. Campos and A. Paiva, “MAY: My memories are yours,” in International Conference on Intelligent Virtual Agents, 2010, pp. 406–412.
[19] J. Carvalho and F. D. Campos, “May: My memories are yours. An interactive companion that saves the user’s memories,” Ph.D. dissertation, 2010.
[20] M. A. Conway, “Sensory-perceptual episodic memory and its context: Autobiographical memory,” Philosophical Transactions of the Royal Society B: Biological Sciences, vol. 356, no. 1413, pp. 1375–1384, 2001.
[21] Y. Wilks, R. Catizone, S. Worgan, A. Dingli, R. Moore, D. Field, and W. Cheng, “A prototype for a conversational companion for reminiscing about images,” Computer Speech and Language, vol. 25, no. 2, pp. 140–157, 2011.
[22] O. Vinyals, A. Toshev, S. Bengio, and D. Erhan, “Show and tell: A neural image caption generator,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.
[23] A. Karpathy and F.-F. Li, “Deep visual-semantic alignments for generating image descriptions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.
[24] S. Antol, A. Agrawal, J. Lu, M. Mitchell, D. Batra, C. L. Zitnick, and D. Parikh, “VQA: Visual question answering,” in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 2425–2433.
[25] N. Mostafazadeh, I. Misra, J. Devlin, M. Mitchell, X. He, and L. Vanderwende, “Generating natural questions about an image,” arXiv preprint arXiv:1603.06059, pp. 1802–1813, 2016.
[26] N. Mostafazadeh, C. Brockett, B. Dolan, M. Galley, J. Gao, G. P. Spithourakis, and L. Vanderwende, “Image-grounded conversations: Multimodal context for natural question and response generation,” arXiv preprint arXiv:1701.08251, 2017.
[27] S. Aditya, Y. Yang, C. Baral, C. Fermuller, and Y. Aloimonos, “From images to sentences through scene description graphs using reasoning and knowledge,” arXiv preprint arXiv:1511.03292, 2015.
[28] G. A. Miller, “WordNet: A lexical database for English,” Communications of the ACM, vol. 38, no. 11, pp. 39–41, 1995.
[29] P. Wang, Q. Wu, C. Shen, A. Van Den Hengel, and A. Dick, “Explicit knowledge-based reasoning for visual question answering,” arXiv preprint arXiv:1511.02570, 2015.
[30] C. Bizer, G. Kobilarov, J. Lehmann, R. Cyganiak, and Z. Ives, “DBpedia: A nucleus for a web of open data,” The Semantic Web, pp. 722–735, 2007.
[31] S. Rakangor and D. Y. R. Ghodasara, “Literature review of automatic question generation systems,” International Journal of Scientific and Research Publications, pp. 1–5, 2015.
[32] V. K. Chaudhri, P. E. Clark, A. Overholtzer, and A. Spaulding, “Question generation from a knowledge base,” in International Conference on Knowledge Engineering and Knowledge Management, 2014.
[33] L. Song and L. Zhao, “Question generation from a knowledge base with web exploration,” arXiv preprint arXiv:1610.03807, 2017.
[34] H. Liu and P. Singh, “ConceptNet - A practical commonsense reasoning tool-kit,” BT Technology Journal, vol. 22, no. 4, pp. 211–226, 2004.
[35] E. Cambria, R. Speer, C. Havasi, and A. Hussain, “SenticNet: A publicly available semantic resource for opinion mining,” in AAAI Fall Symposium: Commonsense Knowledge, 2010.
[36] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.
[37] F. M. Suchanek, G. Kasneci, and G. Weikum, “Yago: A core of semantic knowledge,” in Proceedings of the 16th International Conference on World Wide Web, 2007, pp. 697–706.
[38] A. Singhal, “Introducing the Knowledge Graph: Things, not strings,” 2012. [Online]. Available: https://googleblog.blogspot.tw/2012/05/introducing-knowledge-graph-things-not.html
[39] D. Lenat, “CYC: A large-scale investment in knowledge infrastructure,” Communications of the ACM, vol. 38, no. 11, pp. 33–38, 1995.
[40] N. Tandon, G. de Melo, F. Suchanek, and G. Weikum, “WebChild: Harvesting and organizing commonsense knowledge from the web,” in Proceedings of the 7th ACM International Conference on Web Search and Data Mining, 2014, pp. 523–532.
[41] R. Speer, J. Chin, and C. Havasi, “ConceptNet 5.5: An open multilingual graph of general knowledge,” arXiv preprint arXiv:1612.03975, 2016.
[42] E. Cambria, S. Poria, and R. Bajpai, “SenticNet 4: A semantic resource for sentiment analysis based on conceptual primitives,” in Proceedings of the 26th International Conference on Computational Linguistics, 2016, pp. 2666–2677.
[43] T. Mikolov, I. Sutskever, K. Chen, G. Corrado, and J. Dean, “Distributed representations of words and phrases and their compositionality,” Advances in Neural Information Processing Systems, pp. 3111–3119, 2013.
[44] T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient estimation of word representations in vector space,” arXiv preprint arXiv:1301.3781, 2013.
[45] J. Pennington, R. Socher, and C. D. Manning, “GloVe: Global vectors for word representation,” in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, 2014, pp. 1532–1543.
[46] X. Chen, L. Xu, Z. Liu, M. Sun, and H. Luan, “Joint learning of character and word embeddings,” in International Joint Conference on Artificial Intelligence, 2015, pp. 1236–1242.
[47] T. Heskes, “Stable fixed points of loopy belief propagation are minima of the Bethe free energy,” Advances in Neural Information Processing Systems, pp. 359–366, 2003.
[48] R. Szeliski, R. Zabih, D. Scharstein, O. Veksler, V. Kolmogorov, A. Agarwala, M. Tappen, and C. Rother, “A comparative study of energy minimization methods for Markov random fields,” in European Conference on Computer Vision, 2006, pp. 16–29.
[49] J. M. Zacks, T. S. Braver, M. A. Sheridan, D. I. Donaldson, A. Z. Snyder, J. M. Ollinger, R. L. Buckner, and M. E. Raichle, “Human brain activity time-locked to perceptual event boundaries,” Nature Neuroscience, vol. 4, no. 6, pp. 651–655, 2001.
[50] X. Liu, M. Wang, and B. Huet, “Event analysis in social multimedia: A survey,” Frontiers of Computer Science, vol. 10, no. 3, pp. 433–446, 2016.
[51] R. Poppe, “A survey on vision-based human action recognition,” Image and Vision Computing, vol. 28, no. 6, pp. 976–990, 2010.
[52] P. Turaga, R. Chellappa, V. S. Subrahmanian, and O. Udrea, “Machine recognition of human activities: A survey,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 18, no. 11, pp. 1473–1488, 2008.
[53] K. Ahmad, N. Conci, G. Boato, and F. G. B. De Natale, “USED: A large-scale social event detection dataset,” in Proceedings of the 7th International Conference on Multimedia Systems, 2016, pp. 50:1–50:6.
[54] R. Mattivi, J. Uijlings, F. G. De Natale, and N. Sebe, “Exploitation of time constraints for (sub-)event recognition,” in Proceedings of the 2011 Joint ACM Workshop on Modeling and Representing Events, 2011, pp. 7–12.
[55] G. Petkos, S. Papadopoulos, V. Mezaris, and Y. Kompatsiaris, “Social event detection at MediaEval 2014: Challenges, datasets, and evaluation,” in MediaEval, 2014.
[56] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell, “Caffe: Convolutional architecture for fast feature embedding,” in Proceedings of the ACM International Conference on Multimedia, 2014, pp. 675–678.
[57] B. Zhou, A. Khosla, A. Lapedriza, A. Torralba, and A. Oliva, “Places: An image database for deep scene understanding,” arXiv preprint arXiv:1610.02055, 2016.
[58] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, “SSD: Single shot multibox detector,” in European Conference on Computer Vision, 2016, pp. 21–37.
[59] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, “Microsoft COCO: Common objects in context,” in European Conference on Computer Vision, 2014, pp. 740–755.
[60] N. Tandon, G. de Melo, A. De, and G. Weikum, “Knowlywood: Mining activity knowledge from Hollywood narratives,” in Proceedings of the 24th ACM International Conference on Information and Knowledge Management, 2015, pp. 223–232.
[61] M. Schmidt, “UGM: A Matlab toolbox for probabilistic undirected graphical models,” 2007. [Online]. Available: http://www.cs.ubc.ca/~schmidtm/Software/UGM.html
[62] S.-M. Wang, C.-H. Li, Y.-C. Lo, T.-H. K. Huang, and L.-W. Ku, “Sensing emotions in text messages: An application and deployment study of EmotionPush,” arXiv preprint arXiv:1610.04758, 2016.
[63] H. Duan, Y. Cao, C.-Y. Lin, and Y. Yu, “Searching questions by identifying question topic and question focus,” in Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 2008, pp. 156–164.
[64] C. Liu, C. T. Ishi, H. Ishiguro, and N. Hagita, “Generation of nodding, head tilting and eye gazing for human-robot dialogue interaction,” in Proceedings of the 7th ACM/IEEE International Conference on Human-Robot Interaction, 2012, p. 292.
[65] R. F. Rachmadi, K. Uchimura, and G. Koutaki, “Spatial pyramid convolutional neural network for social event detection in static image,” arXiv preprint arXiv:1612.04062, 2016.
[66] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “ImageNet: A large-scale hierarchical image database,” in IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 248–255.
[67] A. Steinfeld, T. Fong, D. Kaber, M. Lewis, J. Scholtz, A. Schultz, and M. Goodrich, “Common metrics for human-robot interaction,” in Proceedings of the 1st ACM SIGCHI/SIGART Conference on Human-Robot Interaction, 2006, pp. 33–40.
[68] R. R. Murphy and D. Schreckenghost, “Survey of metrics for human-robot interaction,” in Proceedings of the 8th ACM/IEEE International Conference on Human-Robot Interaction, 2013, pp. 197–198.
[69] K. Jokinen and G. Wilcock, “Modelling user experience in human-robot interactions,” in International Workshop on Multimodal Analyses Enabling Artificial Agents in Human-Machine Interaction, 2014, pp. 45–56. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/66998 | - |
dc.description.abstract | 回憶是我們一生當中都一直在從事的活動,我們時常回想過去曾經做過的事情。記憶往往可以作為人們聊天的話題來源,回顧過去的同時也能幫助人們建立自尊心、感受快樂與幸福。在本論文中,我們的目標是開發一個能夠幫助人們從照片中回想起過去回憶的陪伴機器人。我們專注於機器人如何聯想與相片內容相關的概念,並透過詢問相關且吸引人的問題來喚起人們的記憶。為了瞭解照片中的內容,我們應用深度學習的技術來辨識影像中的活動、物體及場景,而後在包含活動常識的馬可夫隨機場中,考慮來自照片及使用者話語的觀察結果,使用循環置信傳播演算法來推斷可能聯想到的概念與話題。之後,機器人根據選擇的話題來提出適當的問題,並在互動中引導使用者回想回憶。最後,我們通過精心設計的實驗評估我們的系統,實驗結果顯示,我們提出的系統能夠提出適當且相關的問題,並且有潛力能夠幫助使用者以有組織性的方式回憶過去。 | zh_TW |
dc.description.abstract | Reminiscence is an activity we engage in throughout our lifespan. Memories can serve as topics for everyday conversation, and recalling the past also helps people build self-esteem and increase their happiness. In this thesis, we aim to develop a companion robot that helps people recollect memories from their personal photos. We focus on how a robot can associate concepts relevant to the content of a photo and evoke the user's memory by asking related and engaging questions. To understand the content of a photo, we apply deep learning techniques to recognize the events, objects, and scenes in the image. Observations from the photo and the user's utterances are then incorporated into a Markov random field that encodes commonsense knowledge of events, and loopy belief propagation is used to infer likely associated concepts and topics. Appropriate questions about the selected topics are then posed to guide the user through recollection during the interaction. Finally, we evaluate our system through carefully designed experiments. The results show that the proposed system poses proper and relevant questions when interacting with the user, and has the potential to help and guide the user to recall the past in an organized way. | en
dc.description.provenance | Made available in DSpace on 2021-06-17T01:16:49Z (GMT). No. of bitstreams: 1 ntu-106-R04922068-1.pdf: 6440653 bytes, checksum: 1d049871c4e6cdb445452122341b803d (MD5) Previous issue date: 2017 | en |
dc.description.tableofcontents | 口試委員審定書 i
誌謝 ii
摘要 iv
Abstract v
Contents vi
List of Figures ix
List of Tables xi
1 Introduction 1
1.1 Background 1
1.2 Motivation 2
1.3 Related Work 3
1.3.1 Memory Retrieval in Human Memory System 3
1.3.2 Reminiscence System 7
1.3.3 Visual Understanding and Question Generation 10
1.4 Objective and Contribution 13
1.5 Thesis Organization 14
2 Preliminary 15
2.1 Convolutional Neural Network 15
2.1.1 Convolutional Layer 15
2.1.2 Pooling Layer 17
2.1.3 Fully Connected Layer 17
2.1.4 VGG-16 Net 17
2.2 Commonsense Knowledge 18
2.2.1 ConceptNet 19
2.2.2 SenticNet 21
2.2.3 Word Embedding 21
2.3 Graphical Model Modeling and Inference 23
2.3.1 Markov Random Field 23
2.3.2 Factor Graph 24
2.3.3 Loopy Belief Propagation 25
3 Interactive Question-Posing System 27
3.1 System Overview 27
3.2 Image Understanding 28
3.3 Concept Association Knowledge Graph 32
3.3.1 Knowledge Graph Structure and Construction 34
3.3.2 Appropriateness of Topic 36
3.3.3 Association of Relevant Concepts 37
3.4 Modeling Concept Association and Topic Appropriateness 39
3.4.1 Model Construction 39
3.4.2 Concept Inference 41
3.5 Interaction Management 43
3.5.1 Decision Making 45
3.5.2 Topic Selection 47
3.5.3 Question-Posing 49
3.5.4 Follow-up Response Generation 51
4 Evaluation 52
4.1 Data Collection 53
4.1.1 Question Collection 53
4.1.2 Personal Photo Collection 53
4.2 Event Recognition Evaluation 54
4.2.1 Data Description 54
4.2.2 Results and Discussion 55
4.3 Concept Inference Model Evaluation 57
4.3.1 Participants 58
4.3.2 Procedure 58
4.3.3 Results and Discussion 59
4.4 Human-Robot Interaction Experiment 60
4.4.1 Experimental Setup 60
4.4.2 Reminiscence Strategies 63
4.4.3 Questionnaire 64
4.4.4 Participants 65
4.4.5 Procedure 65
4.4.6 Results and Discussion 67
5 Conclusion 73
References 75 | |
dc.language.iso | en | |
dc.title | 回憶個人照片之互動式問題提出系統 | zh_TW |
dc.title | Interactive Question-Posing System for Reminiscing about Personal Photos | en |
dc.type | Thesis | |
dc.date.schoolyear | 105-2 | |
dc.description.degree | 碩士 (Master) | |
dc.contributor.oralexamcommittee | 李蔡彥,蘇木春,項天瑞,葉素玲 | |
dc.subject.keyword | 懷舊,社交陪伴機器人,馬可夫隨機場,知識圖,互動式提問, | zh_TW |
dc.subject.keyword | Reminiscence,Social Companion Robot,Markov Random Fields,Knowledge Graph,Interactive Question-Posing | en |
dc.relation.page | 82 | |
dc.identifier.doi | 10.6342/NTU201702347 | |
dc.rights.note | 有償授權 (paid authorization) | |
dc.date.accepted | 2017-08-14 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 資訊工程學研究所 | zh_TW |
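The abstract above describes inferring associated concepts with loopy belief propagation over a Markov random field that encodes commonsense knowledge of events. As a minimal, self-contained illustration of that inference step (not the thesis implementation: the triangle graph, binary concept states, and all potential values below are invented for this sketch), sum-product loopy BP can be run on a small pairwise MRF and compared against exact brute-force marginals:

```python
# Minimal sketch of sum-product loopy belief propagation on a toy pairwise MRF.
# Variables are hypothetical binary "concept inactive/active" nodes; the unary
# potentials stand in for evidence from image recognition, and the pairwise
# potentials encode that associated concepts prefer to agree. All values are
# illustrative, not taken from the thesis.
import itertools
import numpy as np

# Three concept nodes connected in a triangle (a loopy graph).
edges = [(0, 1), (1, 2), (0, 2)]
n_vars, n_states = 3, 2

# Unary potentials, e.g. visual evidence for each concept.
unary = np.array([[0.2, 0.8],   # node 0 leans "active"
                  [0.5, 0.5],   # node 1 has no evidence
                  [0.7, 0.3]])  # node 2 leans "inactive"

# Symmetric pairwise potential: agreeing states are preferred.
pair = np.array([[2.0, 1.0],
                 [1.0, 2.0]])

def loopy_bp(n_iter=50):
    """Parallel message updates; msgs[(i, j)] is the message from i to j."""
    msgs = {(i, j): np.ones(n_states)
            for a, b in edges for i, j in [(a, b), (b, a)]}
    for _ in range(n_iter):
        new = {}
        for (i, j) in msgs:
            # Product of the unary at i and all incoming messages except j's.
            prod = unary[i].copy()
            for (k, l) in msgs:
                if l == i and k != j:
                    prod *= msgs[(k, l)]
            m = pair.T @ prod        # marginalize over states of node i
            new[(i, j)] = m / m.sum()  # normalize for numerical stability
        msgs = new
    # Belief at j = unary[j] times all incoming messages, then normalize.
    beliefs = unary.copy()
    for (i, j) in msgs:
        beliefs[j] *= msgs[(i, j)]
    return beliefs / beliefs.sum(axis=1, keepdims=True)

def brute_force():
    """Exact marginals by enumerating every joint configuration."""
    marg = np.zeros((n_vars, n_states))
    for x in itertools.product(range(n_states), repeat=n_vars):
        p = np.prod([unary[i, x[i]] for i in range(n_vars)])
        p *= np.prod([pair[x[a], x[b]] for a, b in edges])
        for i in range(n_vars):
            marg[i, x[i]] += p
    return marg / marg.sum(axis=1, keepdims=True)

bp, exact = loopy_bp(), brute_force()
```

On a graph with cycles, BP beliefs are only approximate marginals, but with mild potentials like these they stay close to the exact values; ranking topics by the belief in a concept's "active" state is one plausible way such inference feeds topic selection.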
Appears in Collections: | Department of Computer Science and Information Engineering
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-106-1.pdf (currently not authorized for public access) | 6.29 MB | Adobe PDF |
Items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.