Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/819
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 謝宏昀 | |
dc.contributor.author | Ya-Han Yang | en |
dc.contributor.author | 楊雅涵 | zh_TW |
dc.date.accessioned | 2021-05-11T05:07:48Z | - |
dc.date.available | 2020-02-15 | |
dc.date.available | 2021-05-11T05:07:48Z | - |
dc.date.copyright | 2019-02-15 | |
dc.date.issued | 2019 | |
dc.date.submitted | 2019-02-14 | |
dc.identifier.citation | [1] W. Nam, D. Bai, J. Lee, and I. Kang, “Advanced interference management for 5G cellular networks,” IEEE Communications Magazine, vol. 52, no. 5, pp. 52–60, 2014.
[2] T. Yoo, N. Jindal, and A. Goldsmith, “Multi-antenna downlink channels with limited feedback and user selection,” IEEE Journal on Selected Areas in Communications, vol. 25, no. 7, 2007.
[3] M. Trivellato, F. Boccardi, and F. Tosato, “User selection schemes for MIMO broadcast channels with limited feedback,” in IEEE 65th Vehicular Technology Conference (VTC2007-Spring), 2007, pp. 2089–2093.
[4] J. Schaepperle and A. Rüegg, “Enhancement of throughput and fairness in 4G wireless access systems by non-orthogonal signaling,” Bell Labs Technical Journal, vol. 13, no. 4, pp. 59–77, 2009.
[5] M.-J. Yang and H.-Y. Hsieh, “Moving towards non-orthogonal multiple access in next-generation wireless access networks,” in 2015 IEEE International Conference on Communications (ICC), 2015, pp. 5633–5638.
[6] S. Timotheou and I. Krikidis, “Fairness for non-orthogonal multiple access in 5G systems,” IEEE Signal Processing Letters, vol. 22, no. 10, pp. 1647–1651, 2015.
[7] H. Sun, Y. Xu, and R. Q. Hu, “A NOMA and MU-MIMO supported cellular network with underlaid D2D communications,” in 2016 IEEE 83rd Vehicular Technology Conference (VTC Spring), 2016, pp. 1–5.
[8] B. Kim, S. Lim, H. Kim, S. Suh, J. Kwun, S. Choi, C. Lee, S. Lee, and D. Hong, “Non-orthogonal multiple access in a downlink multiuser beamforming system,” in IEEE Military Communications Conference (MILCOM 2013), 2013, pp. 1278–1283.
[9] A. Sampath, P. S. Kumar, and J. M. Holtzman, “On setting reverse link target SIR in a CDMA system,” in IEEE 47th Vehicular Technology Conference, vol. 2, 1997, pp. 929–933.
[10] M. G. Sarret, D. Catania, F. Frederiksen, A. F. Cattoni, G. Berardinelli, and P. Mogensen, “Dynamic outer loop link adaptation for the 5G centimeter-wave concept,” in Proceedings of European Wireless 2015; 21st European Wireless Conference, 2015, pp. 1–6.
[11] V. Buenestado, J. M. Ruiz-Avilés, M. Toril, S. Luna-Ramírez, and A. Mendo, “Analysis of throughput performance statistics for benchmarking LTE networks,” IEEE Communications Letters, vol. 18, no. 9, pp. 1607–1610, 2014.
[12] F. Blanquez-Casado, G. Gomez, M. del Carmen Aguayo-Torres, and J. T. Entrambasaguas, “eOLLA: an enhanced outer loop link adaptation for cellular networks,” EURASIP Journal on Wireless Communications and Networking, vol. 2016, no. 1, p. 20, 2016.
[13] S. Park, R. C. Daniels, and R. W. Heath, “Optimizing the target error rate for link adaptation,” in 2015 IEEE Global Communications Conference (GLOBECOM), 2015, pp. 1–6.
[14] R. A. Delgado, K. Lau, R. Middleton, R. S. Karlsson, T. Wigren, and Y. Sun, “Fast convergence outer loop link adaptation with infrequent updates in steady state,” in 2017 IEEE 86th Vehicular Technology Conference (VTC-Fall), 2017, pp. 1–5.
[15] T. Ohseki and Y. Suegara, “Fast outer-loop link adaptation scheme realizing low-latency transmission in LTE-Advanced and future wireless networks,” in 2016 IEEE Radio and Wireless Symposium (RWS), 2016, pp. 1–3.
[16] R. Daniels and R. W. Heath, “Online adaptive modulation and coding with support vector machines,” in 2010 European Wireless Conference (EW), 2010, pp. 718–724.
[17] R. C. Daniels, C. M. Caramanis, and R. W. Heath, “Adaptation in convolutionally coded MIMO-OFDM wireless systems through supervised learning and SNR ordering,” IEEE Transactions on Vehicular Technology, vol. 59, no. 1, pp. 114–126, 2010.
[18] R. Bruno, A. Masaracchia, and A. Passarella, “Robust adaptive modulation and coding (AMC) selection in LTE systems using reinforcement learning,” 2014, pp. 1–6.
[19] J. P. Leite, P. H. P. de Carvalho, and R. D. Vieira, “A flexible framework based on reinforcement learning for adaptive modulation and coding in OFDM wireless systems,” in 2012 IEEE Wireless Communications and Networking Conference (WCNC), 2012, pp. 809–814.
[20] S. Yun and C. Caramanis, “Reinforcement learning for link adaptation in MIMO-OFDM wireless systems,” in 2010 IEEE Global Telecommunications Conference (GLOBECOM 2010), 2010, pp. 1–5.
[21] M. Rupp, S. Schwarz, and M. Taranetz, The Vienna LTE-Advanced Simulators: Up and Downlink, Link and System Level Simulation, 1st ed., ser. Signals and Communication Technology. Springer Singapore, 2016.
[22] X. Xia, S. Fang, G. Wu, and S. Li, “Joint user pairing and precoding in MU-MIMO broadcast channel with limited feedback,” IEEE Communications Letters, vol. 14, no. 11, pp. 1032–1034, 2010.
[23] L. Chen, Z. Chen, L. Liu, and B. Fu, “Successive precoding and user selection in MU-MIMO broadcast channel with limited feedback,” in 2014 Wireless Telecommunications Symposium (WTS), 2014, pp. 1–5.
[24] L. Yin, W. O. Popoola, X. Wu, and H. Haas, “Performance evaluation of non-orthogonal multiple access in visible light communication,” IEEE Transactions on Communications, vol. 64, no. 12, pp. 5162–5175, 2016.
[25] Z. Shen, R. Chen, J. G. Andrews, R. W. Heath, and B. L. Evans, “Low complexity user selection algorithms for multiuser MIMO systems with block diagonalization,” IEEE Transactions on Signal Processing, vol. 54, no. 9, pp. 3658–3663, 2006.
[26] S. Tomida and K. Higuchi, “Non-orthogonal access with SIC in cellular downlink for user fairness enhancement,” in 2011 International Symposium on Intelligent Signal Processing and Communications Systems (ISPACS), 2011, pp. 1–6.
[27] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press, 1998.
[28] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, et al., “Mastering the game of Go with deep neural networks and tree search,” Nature, vol. 529, no. 7587, pp. 484–489, 2016.
[29] K. I. Pedersen, G. Monghal, I. Z. Kovacs, T. E. Kolding, A. Pokhariyal, F. Frederiksen, and P. Mogensen, “Frequency domain scheduling for OFDMA with limited and noisy channel feedback,” in 2007 IEEE 66th Vehicular Technology Conference (VTC-2007 Fall), 2007, pp. 1792–1796.
[30] A. Duran, M. Toril, F. Ruiz, and A. Mendo, “Self-optimization algorithm for outer loop link adaptation in LTE,” IEEE Communications Letters, vol. 19, no. 11, pp. 2005–2008, 2015.
[31] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, “TensorFlow: Large-scale machine learning on heterogeneous systems,” 2015, software available from tensorflow.org. Online. Available at: http://tensorflow.org/
[32] F. Chollet et al., “Keras,” https://github.com/fchollet/keras, 2015.
[33] T.-Y. Ho, P.-M. Lam, and C.-S. Leung, “Parallelization of cellular neural networks on GPU,” Pattern Recognition, vol. 41, no. 8, pp. 2684–2692, 2008.
[34] S. Han, X. Liu, H. Mao, J. Pu, A. Pedram, M. A. Horowitz, and W. J. Dally, “EIE: Efficient inference engine on compressed deep neural network,” in 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA), 2016, pp. 243–254.
[35] M. Shafique, T. Theocharides, C.-S. Bouganis, M. A. Hanif, F. Khalid, R. Hafiz, and S. Rehman, “An overview of next-generation architectures for machine learning: Roadmap, opportunities and challenges in the IoT era,” in Design, Automation & Test in Europe Conference & Exhibition (DATE), 2018, pp. 827–832.
[36] V. Sze, Y.-H. Chen, J. Emer, A. Suleiman, and Z. Zhang, “Hardware for machine learning: Challenges and opportunities,” in 2017 IEEE Custom Integrated Circuits Conference (CICC), 2017, pp. 1–8.
[37] V. Mnih, A. P. Badia, M. Mirza, A. Graves, T. Lillicrap, T. Harley, D. Silver, and K. Kavukcuoglu, “Asynchronous methods for deep reinforcement learning,” in International Conference on Machine Learning, 2016, pp. 1928–1937. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/handle/123456789/819 | - |
dc.description.abstract | 有許多的研究專注於結合多天線多使用者系統(MU-MIMO)和非正交多工接取(NOMA)這兩項技術去提升流量,但這些研究大部分並沒有考慮在實際LTE/LTE-A環境下,通道資訊回饋是有限的。而根據我們的分析,有限通道資訊回饋造成的量化誤差會導致不合適的資源配置,因此,在結合這兩項技術時,必須考慮如何獲得準確信號與干擾雜訊比(SINR)的方法。為了避免改變目前LTE-A的通道回饋規範,本論文採用Outer Loop Link Adaption(OLLA),利用混合動態回傳(HARQ)來動態調整估SINR。對於LTE-A連結時間短的連線,OLLA的收斂速度會是一個重要的議題。針對該議題,現行的OLLA多只考慮通道品質指標(CQI),但在多天線系統中,預編碼索引(PMI)以及不同排程組合的干擾也須列入考量。在本論文中,我們使用強化學習去改良OLLA。強化學習可以自動與環境互動,觀察出各種不被現有知識限制的策略。但當將強化學習應用至一個新的領域時,對於該問題的了解會是能否有效訓練的關鍵,因此,我們分析了何種因素會影響OLLA的策略。基於該分析,我們考慮了排程過的使用者資訊、通道回饋、偏好的調制與編碼策略和一起排程的使用者以設計合適的特徵擷取、獎賞設計(reward shaping)和探索(exploration and exploitation)機制,並對訓練相關參數進行各種嘗試,以提供更有效率的訓練架構。與OLLA的基線相比,我們提出的方法在結合NOMA+MU-MIMO時有7%的增益,在MU-MIMO有14%的增益。此外,OLLA收斂速度增快了38%。總而言之,我們提出一個能自動改良OLLA的架構,此架構能有效針對不同回饋處理干擾並改善流量和收斂速度。 | zh_TW |
dc.description.abstract | Much work has been done to improve overall throughput by jointly considering MU-MIMO and NOMA, but little has considered combining these techniques under the practical conditions of LTE/LTE-A, in which CSI feedback is limited. Based on our analysis, a method capable of obtaining an accurate SINR is important for reducing the improper resource allocation caused by such limited CSI feedback. In this work, to avoid changing the current feedback architecture in LTE-A, we adopt outer loop link adaptation (OLLA) to dynamically adjust the MCS according to HARQ feedback. Convergence plays a crucial role when applying OLLA in LTE-A because connections tend to be short. Regarding this convergence issue, most existing OLLA schemes consider only the CQI, whereas in MU-MIMO the PMI and user pairing should also be taken into account. In this work, we adopt reinforcement learning to enhance OLLA. Reinforcement learning is a technique that can explore unknown strategies by interacting with the environment. When applying reinforcement learning to a new field, domain knowledge is important for effective training. Therefore, the factors affecting the strategy of OLLA, including the past assigned MCSs of the scheduled users, the feedback, the desired MCS, and the paired user, are analyzed to guide state design, reward shaping, and the exploration-exploitation mechanism. Our proposed method improves throughput by 7% in NOMA+MU-MIMO and by 14% in MU-MIMO; moreover, the convergence speed is increased by 38%. To conclude, we propose an architecture that enhances OLLA automatically, handles interference under different types of feedback, and improves both throughput and convergence speed. | en |
dc.description.provenance | Made available in DSpace on 2021-05-11T05:07:48Z (GMT). No. of bitstreams: 1 ntu-108-R04942076-1.pdf: 9659460 bytes, checksum: dde6963c9d7ca4b3d0a025771caa1be0 (MD5) Previous issue date: 2019 | en |
dc.description.tableofcontents | ABSTRACT ii
LIST OF TABLES v
LIST OF FIGURES vi
CHAPTER 1 INTRODUCTION 1
CHAPTER 2 BACKGROUND AND RELATED WORK 5
2.1 Background 5
2.1.1 MU-MIMO Overview 5
2.1.2 NOMA Overview 7
2.1.3 Reinforcement Learning 9
2.2 Related Work 11
2.2.1 NOMA+MU-MIMO 11
2.2.2 Outer Loop Link Adaptation 12
2.2.3 Machine Learning based Link Adaptation 14
CHAPTER 3 SCENARIO AND PROBLEM FORMULATIONS 15
3.1 Network Model of NOMA+MU-MIMO 15
3.2 Communication System in LTE/LTE-A 18
3.3 Scenario in LTE/LTE-A 20
3.4 Problem Formulations 21
3.4.1 Observation of NOMA+MU-MIMO and MU-MIMO 21
3.4.2 The Impact of CSI on SINR 22
3.4.3 Convergence Formulation 25
3.5 Analysis of the Convergence Formulation 25
3.5.1 Analysis of the SINR in MU-MIMO 26
3.5.2 Observation of SINR in Different CQI and PMI 28
CHAPTER 4 PROPOSED REINFORCEMENT LEARNING BASED LINK ADAPTATION 32
4.1 Motivation of Reinforcement Learning 32
4.2 Train an OLLA Agent based on Reinforcement Learning 35
4.3 Proposed Mechanism in Communication System 37
4.3.1 The Design of State, Reward, and Neural Network 37
4.4 Reinforcement Learning Algorithm 46
4.4.1 Asynchronous Advantage Actor-Critic Agents (A3C) 46
4.4.2 Implementation and Modification of A3C 47
4.5 Proposed Feedback and Scheduler 51
CHAPTER 5 PERFORMANCE EVALUATION 59
5.1 Scenario Setting 59
5.2 Simulation Results 60
5.2.1 Verify the Design of Reinforcement Learning 61
5.2.2 Performance in VIENNA 69
CHAPTER 6 CONCLUSION AND FUTURE WORK 82
REFERENCES 83 | |
dc.language.iso | zh-TW | |
dc.title | 有限回授之多天線非正交多工接取系統下使用強化學習選擇調變編碼模式 | zh_TW |
dc.title | On Using Reinforcement Learning to Select Modulation/Coding Schemes for Non-Orthogonal Multiple Access in Multi-User Multiple-Input Multiple-Output Systems with Limited Feedback | en |
dc.date.schoolyear | 107-1 | |
dc.description.degree | 碩士 (Master's) | |
dc.contributor.oralexamcommittee | 林宗男,蘇炫榮 | |
dc.subject.keyword | 多天線多使用者系統,非正交多工接取,有限回授,強化學習,可適應性連結, | zh_TW |
dc.subject.keyword | MU-MIMO,NOMA,link adaptation,reinforcement learning,limited feedback, | en |
dc.relation.page | 85 | |
dc.identifier.doi | 10.6342/NTU201900453 | |
dc.rights.note | Authorization granted (open access worldwide) | |
dc.date.accepted | 2019-02-14 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 電信工程學研究所 | zh_TW |
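The abstract above centers on outer loop link adaptation (OLLA): an SINR offset is nudged up on each HARQ NACK and down on each ACK so that the long-run block error rate converges to a target, and the corrected SINR then drives MCS selection. As a minimal sketch of that standard loop (the 10% BLER target and the step sizes are illustrative assumptions, not values taken from the thesis, which instead learns the adjustment policy with reinforcement learning):

```python
# Conventional OLLA loop: HARQ ACK/NACK feedback steers an SINR offset so
# that the first-transmission BLER converges to a target. All constants
# here are assumed for illustration.

BLER_TARGET = 0.1   # assumed BLER target
DELTA_UP = 0.5      # dB added to the offset on a NACK (assumed step size)
# Choosing the down-step as below makes the offset drift-free exactly at
# the target BLER: BLER*DELTA_UP == (1 - BLER)*DELTA_DOWN.
DELTA_DOWN = DELTA_UP * BLER_TARGET / (1.0 - BLER_TARGET)

def olla_update(offset_db: float, ack: bool) -> float:
    """Return the new SINR offset after one HARQ feedback event."""
    return offset_db - DELTA_DOWN if ack else offset_db + DELTA_UP

def effective_sinr(reported_sinr_db: float, offset_db: float) -> float:
    """SINR used for MCS selection: CQI-reported SINR minus the OLLA offset."""
    return reported_sinr_db - offset_db

# Example: a burst of NACKs raises the offset, making the next MCS choice
# more conservative; subsequent ACKs slowly relax it again.
offset = 0.0
for ack in [False, False, True, True, True]:
    offset = olla_update(offset, ack)
```

The slow decay after errors is exactly the convergence problem the thesis targets: with short LTE-A connections, a fixed-step loop like this may not settle before the connection ends, which motivates replacing the fixed update rule with a learned policy.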
Appears in collections: | 電信工程學研究所 |
Files in this item:
File | Size | Format | |
---|---|---|---|
ntu-108-1.pdf | 9.43 MB | Adobe PDF | View/Open |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.