Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88092

Full metadata record

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 徐宏民 | zh_TW |
| dc.contributor.advisor | Winston Hsu | en |
| dc.contributor.author | 林柏劭 | zh_TW |
| dc.contributor.author | Po-Shao Lin | en |
| dc.date.accessioned | 2023-08-08T16:15:43Z | - |
| dc.date.available | 2023-11-09 | - |
| dc.date.copyright | 2023-08-08 | - |
| dc.date.issued | 2023 | - |
| dc.date.submitted | 2023-07-13 | - |
| dc.identifier.citation | [1] J. Andreas, D. Klein, and S. Levine. Modular multitask reinforcement learning with policy sketches. In International Conference on Machine Learning, pages 166–175. PMLR, 2017. [2] M. Andrychowicz, F. Wolski, A. Ray, J. Schneider, R. Fong, P. Welinder, B. McGrew, J. Tobin, O. Pieter Abbeel, and W. Zaremba. Hindsight experience replay. Advances in Neural Information Processing Systems, 30, 2017. [3] D. Borsa, T. Graepel, and J. Shawe-Taylor. Learning shared representations in multitask reinforcement learning. arXiv preprint arXiv:1603.02041, 2016. [4] D. Calandriello, A. Lazaric, and M. Restelli. Sparse multi-task reinforcement learning. Advances in Neural Information Processing Systems, 27, 2014. [5] R. Caruana. Multitask learning. Machine Learning, 28:41–75, 1997. [6] C. D’Eramo, D. Tateo, A. Bonarini, M. Restelli, J. Peters, et al. Sharing knowledge in multi-task deep reinforcement learning. In 8th International Conference on Learning Representations (ICLR) 2020, Addis Ababa, Ethiopia, April 26–30, 2020, pages 1–11. OpenReview.net, 2020. [7] C. Devin, A. Gupta, T. Darrell, P. Abbeel, and S. Levine. Learning modular neural network policies for multi-task and multi-robot transfer. In 2017 IEEE International Conference on Robotics and Automation (ICRA), pages 2169–2176. IEEE, 2017. [8] T. Haarnoja, A. Zhou, P. Abbeel, and S. Levine. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In International Conference on Machine Learning, pages 1861–1870. PMLR, 2018. [9] J. Meng and F. Zhu. Seek for commonalities: Shared features extraction for multi-task reinforcement learning via adversarial training. Expert Systems with Applications, 224:119975, 2023. [10] V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. Riedmiller. Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602, 2013. [11] L. Pinto and A. Gupta. Learning to push by grasping: Using multiple tasks for effective learning. In 2017 IEEE International Conference on Robotics and Automation (ICRA), pages 2161–2168. IEEE, 2017. [12] S. Ruder. An overview of multi-task learning in deep neural networks. arXiv preprint arXiv:1706.05098, 2017. [13] T. Schaul, J. Quan, I. Antonoglou, and D. Silver. Prioritized experience replay. arXiv preprint arXiv:1511.05952, 2015. [14] O. Sener and V. Koltun. Multi-task learning as multi-objective optimization. Advances in Neural Information Processing Systems, 31, 2018. [15] S. Sodhani, A. Zhang, and J. Pineau. Multi-task reinforcement learning with context-based representations. In International Conference on Machine Learning, pages 9767–9779. PMLR, 2021. [16] L. Sun, H. Zhang, W. Xu, and M. Tomizuka. PaCo: Parameter-compositional multi-task reinforcement learning. arXiv preprint arXiv:2210.11653, 2022. [17] F. Tanaka and M. Yamamura. Multitask reinforcement learning on the distribution of MDPs. In Proceedings 2003 IEEE International Symposium on Computational Intelligence in Robotics and Automation (Cat. No. 03EX694), volume 3, pages 1108–12. IEEE, 2003. [18] N. Vithayathil Varghese and Q. H. Mahmoud. A survey of multi-task deep reinforcement learning. Electronics, 9(9):1363, 2020. [19] Z. Wang, V. Bapst, N. Heess, V. Mnih, R. Munos, K. Kavukcuoglu, and N. de Freitas. Sample efficient actor-critic with experience replay. arXiv preprint arXiv:1611.01224, 2016. [20] A. Wilson, A. Fern, S. Ray, and P. Tadepalli. Multi-task reinforcement learning: a hierarchical Bayesian approach. In Proceedings of the 24th International Conference on Machine Learning, pages 1015–1022, 2007. [21] R. Yang, H. Xu, Y. Wu, and X. Wang. Multi-task reinforcement learning with soft modularization. Advances in Neural Information Processing Systems, 33:4767–4777, 2020. [22] T. Yu, S. Kumar, A. Gupta, S. Levine, K. Hausman, and C. Finn. Gradient surgery for multi-task learning. Advances in Neural Information Processing Systems, 33:5824–5836, 2020. [23] T. Yu, D. Quillen, Z. He, R. Julian, K. Hausman, C. Finn, and S. Levine. Meta-World: A benchmark and evaluation for multi-task and meta reinforcement learning. In Conference on Robot Learning, pages 1094–1100. PMLR, 2020. [24] D. Zha, K.-H. Lai, K. Zhou, and X. Hu. Experience replay optimization. arXiv preprint arXiv:1906.08387, 2019. [25] Y. Zhang and Q. Yang. An overview of multi-task learning. National Science Review, 5(1):30–43, 2018. [26] Y. Zhang and Q. Yang. A survey on multi-task learning. IEEE Transactions on Knowledge and Data Engineering, 34(12):5586–5609, 2021. | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88092 | - |
| dc.description.abstract | 多任務強化學習已成為一個具有挑戰性的問題,旨在減少強化學習的計算成本並利用任務之間的共享特徵來提高個別任務的性能。然而,一個關鍵挑戰在於確定應該在任務之間共享哪些特徵以及如何保留區分每個任務的獨特特徵。這個挑戰常常導致任務性能不平衡的問題,其中某些任務可能主導學習過程,而其他任務則被忽視。在本文中,我們提出了一種新方法,稱為共享獨特特徵和任務感知的優先經驗重放,以提高訓練穩定性並有效利用共享和獨特特徵。我們引入了一種簡單而有效的任務特定嵌入方法,以保留每個任務的獨特特徵,以減輕任務性能不平衡的潛在問題。此外,我們將任務感知設置引入到優先經驗重放算法中,以適應多任務訓練並增強訓練的穩定性。我們的方法在 MetaWorld 的資料測試集中實現了最先進的平均成功率,同時在所有任務上保持穩定的性能,避免了任務性能不平衡的問題。結果證明了我們的方法在應對多任務強化學習挑戰方面的有效性。 | zh_TW |
| dc.description.abstract | Multi-task reinforcement learning (MTRL) has emerged as a challenging problem: it aims to reduce the computational cost of reinforcement learning and to leverage shared features among tasks to improve the performance of individual tasks. However, a key challenge lies in determining which features should be shared across tasks and how to preserve the unique features that differentiate each task. This challenge often leads to the problem of task performance imbalance, where certain tasks may dominate the learning process while others are neglected. In this paper, we propose a novel approach called shared-unique features along with task-aware prioritized experience replay to improve training stability and leverage shared and unique features effectively. We incorporate simple yet effective task-specific embeddings to preserve the unique features of each task and mitigate the potential problem of task performance imbalance. Additionally, we introduce task-aware settings to the prioritized experience replay (PER) algorithm to accommodate multi-task training and enhance training stability. Our approach achieves state-of-the-art average success rates on the Meta-World benchmark while maintaining stable performance across all tasks, avoiding task performance imbalance issues. The results demonstrate the effectiveness of our method in addressing the challenges of MTRL. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-08-08T16:15:43Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2023-08-08T16:15:43Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Verification Letter from the Oral Examination Committee; Acknowledgements; 摘要; Abstract; Contents; List of Figures; List of Tables; Denotation; Chapter 1 Introduction; Chapter 2 Related Work: 2.1 Multi-Task Learning, 2.2 Multi-Task Reinforcement Learning, 2.3 Off-Policy Reinforcement Learning and Experience Replay; Chapter 3 Method: 3.1 Problem Definition, 3.2 Shared-Unique Features, 3.3 Task-Aware Prioritized Experience Replay; Chapter 4 Experimental Results: 4.1 Setup, 4.2 Results on Benchmark, 4.3 Task Imbalance Performance Problem, 4.4 Ablation Study; Chapter 5 Conclusion and Limitations; References; Appendix A Experiment Details: A.1 Comparison of each task's performance in baselines, A.2 Results on MT50 | - |
| dc.language.iso | en | - |
| dc.subject | 多任務強化學習 | zh_TW |
| dc.subject | 共享-獨特特徵 | zh_TW |
| dc.subject | 經驗回放 | zh_TW |
| dc.subject | 機器人學 | zh_TW |
| dc.subject | 深度學習 | zh_TW |
| dc.subject | Experience Replay | en |
| dc.subject | Shared-Unique Features | en |
| dc.subject | Robotics | en |
| dc.subject | Multi-Task Reinforcement Learning | en |
| dc.subject | Deep Learning | en |
| dc.title | 具有共享及獨特特徵和優先權經驗回放的多任務強化學習 | zh_TW |
| dc.title | Multi-Task Reinforcement Learning with Shared-Unique Features and Task-Aware Prioritized Experience Replay | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 111-2 | - |
| dc.description.degree | 碩士 | - |
| dc.contributor.oralexamcommittee | 陳文進;葉梅珍;陳奕廷 | zh_TW |
| dc.contributor.oralexamcommittee | Wen-Chin Chen;Mei-Chen Yeh;Yi-Ting Chen | en |
| dc.subject.keyword | 多任務強化學習,經驗回放,共享-獨特特徵,機器人學,深度學習 | zh_TW |
| dc.subject.keyword | Multi-Task Reinforcement Learning,Experience Replay,Shared-Unique Features,Robotics,Deep Learning | en |
| dc.relation.page | 27 | - |
| dc.identifier.doi | 10.6342/NTU202301033 | - |
| dc.rights.note | 同意授權(限校園內公開) | - |
| dc.date.accepted | 2023-07-14 | - |
| dc.contributor.author-college | 電機資訊學院 | - |
| dc.contributor.author-dept | 資訊工程學系 | - |
| dc.date.embargo-lift | 2028-07-11 | - |
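
The abstract above names two mechanisms (shared-unique features and task-aware prioritized experience replay) but this record carries no implementation details, so the following sketches are illustrative only. First, a minimal reading of the "shared-unique features" idea: a trunk shared by all tasks plus a small learned per-task embedding that preserves task-specific information. All module and parameter names (`SharedUniquePolicy`, `task_dim`, the concatenation point) are assumptions, not the thesis architecture.

```python
# Hypothetical sketch of shared + per-task ("unique") features.
# Not the thesis implementation; the record only provides the abstract.
import torch
import torch.nn as nn


class SharedUniquePolicy(nn.Module):
    def __init__(self, obs_dim, act_dim, num_tasks, hidden=256, task_dim=32):
        super().__init__()
        # Trunk whose weights are shared by every task.
        self.shared = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # Learned per-task embedding that preserves task-unique information.
        self.task_embed = nn.Embedding(num_tasks, task_dim)
        self.head = nn.Linear(hidden + task_dim, act_dim)

    def forward(self, obs, task_id):
        z_shared = self.shared(obs)          # features common to all tasks
        z_unique = self.task_embed(task_id)  # features specific to this task
        return self.head(torch.cat([z_shared, z_unique], dim=-1))
```

Second, one plausible form of "task-aware" prioritized experience replay: keep a separate prioritized segment per task and draw an equal share of every batch from each segment, so TD-error-based prioritization operates within a task rather than letting one task's large errors crowd out the others. Again, the class, argument names, and the equal-share batching rule are assumptions rather than the published method.

```python
# Hypothetical task-aware PER buffer: per-task priority segments with a
# task-balanced batch. Assumed design, not the thesis implementation.
import numpy as np


class TaskAwareReplayBuffer:
    def __init__(self, num_tasks, capacity_per_task, alpha=0.6):
        self.num_tasks = num_tasks
        self.capacity = capacity_per_task
        self.alpha = alpha                                    # priority exponent
        self.storage = [[] for _ in range(num_tasks)]         # transitions per task
        self.priorities = [np.zeros(capacity_per_task) for _ in range(num_tasks)]
        self.pos = [0] * num_tasks                            # next write index per task

    def add(self, task_id, transition, td_error=1.0):
        buf = self.storage[task_id]
        i = self.pos[task_id] % self.capacity
        if len(buf) < self.capacity:
            buf.append(transition)
        else:
            buf[i] = transition                               # overwrite oldest slot
        self.priorities[task_id][i] = (abs(td_error) + 1e-6) ** self.alpha
        self.pos[task_id] += 1

    def sample(self, batch_size):
        # Equal share per task; within a task, sample proportionally to priority.
        per_task = max(1, batch_size // self.num_tasks)
        batch, indices = [], []
        for t in range(self.num_tasks):
            n = len(self.storage[t])
            if n == 0:
                continue
            p = self.priorities[t][:n]
            p = p / p.sum()
            idx = np.random.choice(n, size=min(per_task, n), p=p)
            batch.extend(self.storage[t][i] for i in idx)
            indices.append((t, idx))
        return batch, indices

    def update_priorities(self, indices, td_errors):
        # Refresh priorities after a gradient step (td_errors follow batch order).
        k = 0
        for t, idx in indices:
            for i in idx:
                self.priorities[t][i] = (abs(td_errors[k]) + 1e-6) ** self.alpha
                k += 1
```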

Appears in Collections: 資訊工程學系
Files in This Item:

| File | Size | Format | Access |
|---|---|---|---|
| ntu-111-2.pdf | 4.26 MB | Adobe PDF | Restricted Access |
