Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/3692
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 傅立成(Li-Chen Fu) | |
dc.contributor.author | Ching Lin | en |
dc.contributor.author | 林敬 | zh_TW |
dc.date.accessioned | 2021-05-13T08:36:00Z | - |
dc.date.available | 2019-11-02 | |
dc.date.available | 2021-05-13T08:36:00Z | - |
dc.date.copyright | 2016-11-02 | |
dc.date.issued | 2016 | |
dc.date.submitted | 2016-08-17 | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/3692 | - |
dc.description.abstract | 服務型機器人需要能夠選擇行為,在缺乏使用者命令下自行進行決策,甚至主動提供服務,才能被稱為「自主」。對於掃地機器人、取物機器人等單用途機器人而言,由於它們有個明確的目標,因此可以人工建構一個完整的決策模型作為它們的行為準則。但是對於多用途機器人而言,他們的目標較為曖昧、模糊,甚至沒有明確目標,此時便較難以為他們建立完整的行為模型,使得它們的自主性較低。
「內部平衡理論 (homeostatic drive theory)」是一個在社交機器人中常見的決策理論,它使機器人試著維持其內部狀態的恆定,並根據自身的需求選擇行為。雖然此方法可以提高機器人的自主性,由於此方法忽略了使用者的需求,讓「使用者感知」的能力降低,因此需要調整才能應用於服務型機器人身上。本篇論文將「使用者意圖」以及「使用者回饋」結合至內部平衡理論中,讓決策模型更以使用者為中心,同時保有機器人的高自主性。機器人的內部需求 (drives) 將轉化為動機(motivations),且機器人將同時考慮自身的動機以及使用者的意圖來決定自身的行為。機器人每個行為的效果並非事先定義好的,而是在互動中利用增強式學習 (reinforcement learning) 所得,使得機器人對於環境以及使用者的先前知識的需求都能降到最低。此決策模型於模擬環境中進行測試及訓練,並將機器人在模擬環境中所學知識轉移至真實的機器人進行實地測試。結果顯示機器人在滿足使用者需求的同時也能夠維持自己體內的恆定,提升自主運作時間,同時達成高自主性以及使用者感知能力。 | zh_TW |
dc.description.abstract | For a service robot to reach high autonomy, it should choose what to do, make its own decisions without user commands, and even provide services to the user proactively. For single-purpose robots, such as object-fetching robots or cleaning robots, each is given a specific goal, so a well-structured decision process can be constructed by hand to govern that task. However, for robots with vague goals or no specific goal at all, such as caring robots or personal service robots, it is harder to construct a general-purpose decision process, which lowers their autonomy. Homeostatic drive theory is a dominant psychological approach to decision making for social robots: a robot adopting this theory tries to maintain its internal state in equilibrium and acts according to its own needs. While this approach achieves better autonomy, it ignores the needs of the human user, resulting in a low degree of human awareness. This work integrates human intention and human feedback into a homeostasis-based system, making the decision process more user-centric while maintaining high autonomy. The robot's internal needs (drives) generate motivations, and the robot chooses its actions considering both the user's intention and its own motivations. The effects of its actions are not predefined but are learned during interactions via reinforcement learning, so the system requires minimal prior knowledge about the environment and the user. The proposed system has been tested in simulation and on a real robot. The results show that the robot can not only satisfy its own needs but also serve the user proactively. | en |
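The decision scheme described in the abstract, where internal drives generate motivations, user intention makes the choice user-centric, and the effect of each action is learned online with reinforcement learning, can be illustrated with a minimal sketch. All drive names, actions, and parameter values below are illustrative assumptions, not the thesis's actual implementation, and a stateless (bandit-style) Q-learning update stands in for the thesis's full model.

```python
import random

# Illustrative sketch (assumed names, not the thesis's model): a robot
# keeps homeostatic drives, derives a motivation from their deviation
# from the set point, and learns each action's long-term value with a
# tabular, stateless Q-learning update.

DRIVES = ["energy", "social"]           # internal needs (assumed)
ACTIONS = ["recharge", "chat", "idle"]  # available behaviors (assumed)

ALPHA, GAMMA = 0.1, 0.9  # learning rate and discount factor

class HomeostaticAgent:
    def __init__(self):
        # q[action] estimates the long-term benefit (drive reduction
        # plus user feedback) of taking that action.
        self.q = {a: 0.0 for a in ACTIONS}
        # Drives start halfway between satisfied (0) and urgent (1).
        self.drives = {d: 0.5 for d in DRIVES}

    def motivation(self):
        # Motivation grows as drives deviate from the homeostatic
        # set point, taken here to be 0 for every drive.
        return sum(self.drives.values())

    def select_action(self, user_intention=None, epsilon=0.1):
        # A recognized user intention takes priority over the robot's
        # own motivation, keeping the process user-centric.
        if user_intention in ACTIONS:
            return user_intention
        if random.random() < epsilon:       # explore occasionally
            return random.choice(ACTIONS)
        return max(self.q, key=self.q.get)  # otherwise exploit

    def update(self, action, reward):
        # Q-learning update; with no state variable, the "next state"
        # value is simply the best current action-value estimate.
        best_next = max(self.q.values())
        self.q[action] += ALPHA * (reward + GAMMA * best_next - self.q[action])
```

In use, the reward signal would combine the measured reduction in the relevant drive with explicit user feedback, so the action effects need not be predefined; they emerge from interaction, as the abstract describes.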
dc.description.provenance | Made available in DSpace on 2021-05-13T08:36:00Z (GMT). No. of bitstreams: 1 ntu-105-R03922121-1.pdf: 1738299 bytes, checksum: 83bd17965ae489955bfe574821e4a627 (MD5) Previous issue date: 2016 | en |
dc.description.tableofcontents | Thesis Committee Approval Form iii
Acknowledgements v
Chinese Abstract vii
Abstract xi
1 Introduction 1
1.1 Background and Motivation 1
1.1.1 Service Robots 1
1.1.2 Autonomy 2
1.1.3 Human Awareness 3
1.2 Objectives 3
1.3 Related Work 5
1.4 System Overview 7
1.5 Thesis Organization 8
2 Preliminaries 9
2.1 Markov Models 9
2.1.1 Markov Decision Processes 10
2.2 Reinforcement Learning 12
2.2.1 Standard Modeling 12
2.2.2 Value Functions and Action-Value Functions 13
2.2.3 Q-Learning 15
2.3 Intention Recognition 16
3 Methodology 17
3.1 Terminologies 17
3.1.1 Drive 17
3.1.2 Stimulus 18
3.1.3 Motivation 19
3.1.4 Environment 20
3.1.5 Action 21
3.2 System Model 22
3.2.1 Internal and External States 24
3.2.2 State Reduction 25
3.2.3 Object-Q Learning 27
3.2.4 Selecting Action 27
3.2.5 Reward and Feedback 28
3.2.6 Proposal and Pseudo Update 30
3.3 System Design 31
3.3.1 Drives and Motivations 31
3.3.2 Objects, Stimuli and Actions 35
3.3.3 Degree of Dedication 36
4 Evaluation 39
4.1 Evaluation Metrics 39
4.2 Simulation 42
4.2.1 Setup 42
4.2.2 Result 46
4.2.3 Effects of Motivation Factor 49
4.2.4 Effects of Pseudo Update 51
4.2.5 Effects of Degree of Dedication 52
4.3 Field Test 55
4.3.1 Robot Platform 55
4.3.2 Result 56
5 Conclusion 59
Reference 61 | |
dc.language.iso | en | |
dc.title | 具達成內部平衡之決策模型的使用者感知自主服務型機器人 | zh_TW |
dc.title | A Homeostasis Based Decision Making System on Human-Aware Autonomous Service Robot | en |
dc.type | Thesis | |
dc.date.schoolyear | 104-2 | |
dc.description.degree | Master's | |
dc.contributor.oralexamcommittee | 李琳山(Lin-Shan Lee),蘇木春(Mu-Chun Su),陳永耀(Yung-Yaw Chen),徐國鎧(Kuo-Kai Shyu) | |
dc.subject.keyword | 人工智慧,機器人,機器學習, | zh_TW |
dc.subject.keyword | artificial intelligence,robot,machine learning, | en |
dc.relation.page | 67 | |
dc.identifier.doi | 10.6342/NTU201603154 | |
dc.rights.note | Authorization granted (worldwide open access) | |
dc.date.accepted | 2016-08-19 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 資訊工程學研究所 | zh_TW |
Appears in Collections: | Department of Computer Science and Information Engineering
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-105-1.pdf | 1.7 MB | Adobe PDF | View/Open |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.