NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/72045
Full metadata record
dc.contributor.advisor: 陳光禎
dc.contributor.author: Hsuan-Man Hung (en)
dc.contributor.author: 洪瑄蔓 (zh_TW)
dc.date.accessioned: 2021-06-17T06:20:36Z
dc.date.available: 2019-08-21
dc.date.copyright: 2018-08-21
dc.date.issued: 2018
dc.date.submitted: 2018-08-19
dc.identifier.citation:
[1] C. Boutilier and B. Price. Accelerating reinforcement learning through implicit imitation. CoRR, abs/1106.0681, 2011.
[2] W. Burgard, M. Moors, C. Stachniss, and F. E. Schneider. Coordinated multi-robot exploration. IEEE Transactions on Robotics, 21(3):376-386, June 2005.
[3] L. Busoniu, R. Babuška, and B. De Schutter. A comprehensive survey of multiagent reinforcement learning. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 38(2):156-172, March 2008.
[4] K.-C. Chen, E. Ko, and M. Wu. Networked artificial intelligence. To appear in the IEEE Network.
[5] K.-C. Chen, T. Zhang, R. Gitlin, and G. Fettweis. Ultra-low latency mobile networking. To appear in the IEEE Network.
[6] J. A. Clouse. Learning from an automated training agent. In Adaptation and Learning in Multiagent Systems. Springer Verlag, 1996.
[7] A. Coates, B. Carpenter, C. Case, S. Satheesh, B. Suresh, T. Wang, D. J. Wu, and A. Y. Ng. Text detection and character recognition in scene images with unsupervised feature learning. In Proceedings of the 2011 International Conference on Document Analysis and Recognition, ICDAR '11, pages 440-445, Washington, DC, USA, 2011. IEEE Computer Society.
[8] J. S. Jennings, G. Whelan, and W. F. Evans. Cooperative search and rescue with a team of mobile robots. In Proceedings of the 8th International Conference on Advanced Robotics (ICAR '97), pages 193-200, July 1997.
[9] F. L. Lewis and D. Vrabie. Reinforcement learning and adaptive dynamic programming for feedback control. IEEE Circuits and Systems Magazine, 9(3):32-50, Third Quarter 2009.
[10] S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall Press, Upper Saddle River, NJ, USA, 3rd edition, 2009.
[11] R. S. Sutton and A. G. Barto. Introduction to Reinforcement Learning. MIT Press, Cambridge, MA, USA, 1st edition, 1998.
[12] M. Tan. Multi-agent reinforcement learning: Independent vs. cooperative agents. In Proceedings of the Tenth International Conference on Machine Learning, pages 330-337. Morgan Kaufmann, 1993.
[13] S. Thrun. Learning occupancy grid maps with forward sensor models. Autonomous Robots, 15(2):111-127, September 2003.
[14] A. M. Turing. Computing machinery and intelligence. Mind, LIX(236):433-460, 1950.
[15] C. J. C. H. Watkins and P. Dayan. Q-learning. Machine Learning, 8(3-4):279-292, 1992.
[16] K. M. Wurm, C. Stachniss, and W. Burgard. Coordinated multi-robot exploration using a segmentation of the environment. In 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 1160-1165, September 2008.
[17] Y. Xiang, A. Alahi, and S. Savarese. Learning to track: Online multi-object tracking by decision making. In 2015 IEEE International Conference on Computer Vision (ICCV), pages 4705-4713, December 2015.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/72045
dc.description.abstract (zh_TW, translated): Artificial intelligence has been one of the most active research fields in recent years, with wide-ranging applications: AI is used to find solutions both to problems encountered in daily life and to large, difficult ones. As the complexity and number of intelligent agents grow, the interactions among the agents of the resulting multi-agent system, and the influence the communication network exerts on that system, become increasingly complex. In this thesis, we design reinforcement-learning robotic vacuum cleaners and, taking the learning performance of a single robot as a benchmark, measure the relationship between the learning behaviour of a multi-agent system and the communication environment in which it operates.
dc.description.abstract (en): Artificial intelligence has been one of the hottest research trends, applied to problems ranging from those arising in daily life to highly complicated ones. As the number of intelligent entities grows, the interactions between them become more complex, particularly when communication comes into play. In this thesis, we investigate the relationship between communication and multi-agent learning systems through practical learning tasks. Using a single agent implemented with reinforcement learning as a benchmark, we explore the interplay between various communication scenarios and the learning behaviour of multi-agent systems.
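The abstract's single-agent benchmark is built on reinforcement learning; the record (and keyword list) names tabular Q-learning as the core method. The thesis itself provides no code, so the following is a generic, minimal sketch of the tabular Q-learning update on a hypothetical 1-D corridor world (all names and parameters are illustrative, not the author's implementation):

```python
import random

# Toy 1-D corridor: states 0..4, goal at state 4.
# Actions: 0 = move left, 1 = move right. Reward 1 on reaching the goal.
N_STATES, GOAL = 5, 4

def step(state, action):
    nxt = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q-table: q[state][action]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy action selection
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda x: q[s][x])
            s2, r, done = step(s, a)
            # Q-learning update: off-policy TD(0) toward the greedy target
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q

q = q_learning()
policy = [max((0, 1), key=lambda a: q[s][a]) for s in range(N_STATES)]
# The learned greedy policy moves right toward the goal in states 0..3.
```

The same update rule underlies the thesis's Chapter 2 material; only the state/action spaces and reward design change for the floor-cleaning task.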
dc.description.provenance: Made available in DSpace on 2021-06-17T06:20:36Z (GMT). No. of bitstreams: 1; ntu-107-R04942113-1.pdf: 4051561 bytes, checksum: 88e9517bb5f4742813997b782a429649 (MD5); Previous issue date: 2018 (en)
dc.description.tableofcontents:
誌謝 (Acknowledgements) . . . i
摘要 (Abstract in Chinese) . . . ii
Abstract . . . iii
List of Figures . . . vi
1 Introduction . . . 1
1.1 Artificial Intelligence . . . 1
1.2 Multiagent Systems . . . 2
1.3 Literature Review . . . 3
1.4 Problem Statement . . . 4
1.5 Organization . . . 5
2 Reinforcement Learning . . . 6
2.1 Markov Decision Process . . . 7
2.1.1 Bellman Optimality Principle . . . 11
2.2 Q-learning . . . 14
2.3 n-step Temporal Difference . . . 16
3 Single Agent Learning Task . . . 19
3.1 Problem Description . . . 19
3.2 Single Agent Reinforcement Learning . . . 22
3.2.1 Reinforcement Learning Formulation . . . 22
3.2.2 Simple RL . . . 26
3.2.3 Q-learning . . . 29
3.2.4 Performance Evaluation: Simple RL versus Q-learning . . . 30
3.2.5 n-step Temporal Difference Prediction . . . 34
3.2.6 Performance Comparison: Q-learning and n-step TD . . . 36
4 RL Enhancement with Planning . . . 39
4.1 Fixed Length Planning . . . 39
4.2 Conditional Exhaustive Planning . . . 43
4.2.1 Conditions for Adopting Planning . . . 43
4.2.2 Planning Algorithm: Grassfire Algorithm . . . 45
4.2.3 Learning NB . . . 49
4.2.4 Performance of Conditional Exhaustive Planning . . . 51
5 Collaborative Multi-Agent System . . . 54
5.1 Ideal Communication . . . 55
5.1.1 Information Exchange and Integration . . . 55
5.1.2 Performance Measure . . . 58
5.2 Random Error . . . 60
5.3 Multiple Access . . . 63
5.4 Overall Comparison . . . 67
6 Conclusions and Future Works . . . 70
Bibliography . . . 71
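Section 4.2.2 of the table of contents names the grassfire algorithm as the planner used in conditional exhaustive planning. The record gives no details, but grassfire (wavefront) planning is standard: a breadth-first search from the goal labels every free cell with its hop distance, and the robot then descends the distance field. A minimal sketch under that standard definition (the grid and goal below are illustrative, not from the thesis):

```python
from collections import deque

# Grassfire (wavefront) planning: BFS from the goal assigns each free cell
# its hop distance to the goal; descending the field yields a shortest path.
def grassfire(grid, goal):
    rows, cols = len(grid), len(grid[0])
    dist = [[None] * cols for _ in range(rows)]  # None = unreachable/obstacle
    dist[goal[0]][goal[1]] = 0
    frontier = deque([goal])
    while frontier:
        r, c = frontier.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and dist[nr][nc] is None):
                dist[nr][nc] = dist[r][c] + 1
                frontier.append((nr, nc))
    return dist

# 0 = free cell, 1 = obstacle; goal in the lower-right corner.
grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
d = grassfire(grid, (2, 2))
print(d[0][0])  # → 4: shortest obstacle-free hop count from (0, 0)
```

From any cell, stepping to a neighbor with strictly smaller distance traces a shortest path to the goal, which is why grassfire pairs naturally with a learned occupancy map.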
dc.language.iso: en
dc.subject (zh_TW): 多智能系統 (multi-agent systems)
dc.subject (zh_TW): 多代理人系統 (multi-agent systems)
dc.subject (zh_TW): 增強式學習 (reinforcement learning)
dc.subject (zh_TW): 機器學習 (machine learning)
dc.subject (zh_TW): 強化式學習 (reinforcement learning)
dc.subject (en): multi-agent systems
dc.subject (en): reinforcement learning
dc.subject (en): machine learning
dc.subject (en): Q-learning
dc.subject (en): TD learning
dc.title (zh_TW): 協作多智能系統在網路環境之應用 (Applications of Collaborative Multi-Agent Systems in Networked Environments)
dc.title (en): Communication in Collaborative Multiagent Systems
dc.type: Thesis
dc.date.schoolyear: 106-2
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 林茂昭, 連紹宇, 曾志成, 鄧德雋
dc.subject.keyword (zh_TW): 多智能系統, 多代理人系統, 增強式學習, 機器學習, 強化式學習
dc.subject.keyword (en): multi-agent systems, reinforcement learning, machine learning, Q-learning, TD learning
dc.relation.page: 72
dc.identifier.doi: 10.6342/NTU201804013
dc.rights.note: 有償授權 (paid authorization)
dc.date.accepted: 2018-08-19
dc.contributor.author-college (zh_TW): 電機資訊學院 (College of Electrical Engineering and Computer Science)
dc.contributor.author-dept (zh_TW): 電信工程學研究所 (Graduate Institute of Communication Engineering)
Appears in Collections: 電信工程學研究所 (Graduate Institute of Communication Engineering)

Files in This Item:
File | Size | Format | Access
ntu-107-1.pdf | 3.96 MB | Adobe PDF | 未授權公開取用 (not authorized for public access)


Except where otherwise indicated in their copyright terms, all items in this repository are protected by copyright, with all rights reserved.
