Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21128
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 鐘嘉德 | |
dc.contributor.author | Yu-Feng Zheng | en |
dc.contributor.author | 鄭宇峰 | zh_TW |
dc.date.accessioned | 2021-06-08T03:27:27Z | - |
dc.date.copyright | 2020-07-20 | |
dc.date.issued | 2019 | |
dc.date.submitted | 2019-12-25 | |
dc.identifier.citation | [1] Cisco, "Cisco Visual Networking Index: Forecast and Trends, 2017–2022," White Paper, Nov. 2018.
[2] 3GPP TS 24.234 v12.2.0, "3GPP System to Wireless Local Area Network (WLAN) Interworking; WLAN User Equipment (WLAN UE) to Network Protocols; Stage 3," Mar. 2015.
[3] Wi-Fi Alliance, "Hotspot 2.0 (Release 2) Technical Specification," Wi-Fi Certified Passpoint.
[4] 3GPP TS 23.402 v15.3.0, "Architecture Enhancements for Non-3GPP Accesses," Mar. 2018.
[5] 3GPP TS 24.312 v15.0.0, "Access Network Discovery and Selection Function (ANDSF) Management Object (MO)," Jun. 2018.
[6] D. Senthilkumar and A. Krishnan, "Throughput analysis of IEEE 802.11 multirate WLANs with collision aware rate adaptation algorithm," International Journal of Automation and Computing, vol. 7, no. 4, pp. 571–577, 2010.
[7] Y. He, M. Chen, B. Ge, and M. Guizani, "On WiFi offloading in heterogeneous networks: Various incentives and trade-off strategies," IEEE Communications Surveys & Tutorials, vol. 18, no. 4, pp. 2345–2385, 2016.
[8] E. M. Mohamed, K. Sakaguchi, and S. Sampei, "Delayed offloading zone associations using cloud cooperated heterogeneous networks," in 2015 IEEE Wireless Communications and Networking Conference Workshops (WCNCW), pp. 374–379, IEEE, 2015.
[9] W. Zhang, "Handover decision using fuzzy MADM in heterogeneous networks," in 2004 IEEE Wireless Communications and Networking Conference, vol. 2, pp. 653–658, 2004.
[10] F. Bari and V. C. Leung, "Automated network selection in a heterogeneous wireless network environment," IEEE Network, vol. 21, no. 1, pp. 34–40, 2007.
[11] E. Stevens-Navarro, Y. Lin, and V. W. Wong, "An MDP-based vertical handoff decision algorithm for heterogeneous wireless networks," IEEE Transactions on Vehicular Technology, vol. 57, no. 2, pp. 1243–1254, 2008.
[12] S. Zang, W. Bao, P. L. Yeoh, H. Chen, Z. Lin, B. Vucetic, and Y. Li, "Mobility handover optimization in millimeter wave heterogeneous networks," in 2017 17th International Symposium on Communications and Information Technologies (ISCIT), pp. 1–6, 2017.
[13] X. Chen, H. Wang, X. Xiang, and C. Gao, "Joint handover decision and channel allocation for LTE-A femtocell networks," in The 2014 5th International Conference on Game Theory for Networks, pp. 1–5, 2014.
[14] C. Sun, E. Stevens-Navarro, and V. W. Wong, "A constrained MDP-based vertical handoff decision algorithm for 4G wireless networks," in 2008 IEEE International Conference on Communications, pp. 2169–2174, 2008.
[15] Q. Song and A. Jamalipour, "A quality of service negotiation-based vertical handoff decision scheme in heterogeneous wireless systems," European Journal of Operational Research, vol. 191, no. 3, pp. 1059–1074, 2008.
[16] S. Zang, W. Bao, P. L. Yeoh, B. Vucetic, and Y. Li, "Managing vertical handovers in millimeter wave heterogeneous networks," IEEE Transactions on Communications, vol. 67, no. 2, pp. 1629–1644, 2018.
[17] R. S. Sutton, A. G. Barto, et al., Introduction to Reinforcement Learning, vol. 2, MIT Press, Cambridge, 2017. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21128 | - |
dc.description.abstract | 異質網路提供行動通訊裝置多種無線接取技術。在這種具有多種無線接取技術可採用的環境裡,接取網路的選擇變得很重要。為此我們提出一個新型的網路選擇策略,稱為多種使用者種類的最大化獎勵策略(MU-MRP)。此策略的目的是在合理的價格下提升無線通訊服務的品質,並降低換手次數與換手失敗次數。MU-MRP 為基於半馬可夫決策過程的策略,此數學模型所得到的策略確保可以達成最大的整體獎勵。我們使用 Q 學習演算法找出最優策略,模擬結果顯示 MU-MRP 比其他策略達到更大的獎勵;此外,MU-MRP 可以大幅減低換手次數與換手失敗次數。 | zh_TW |
dc.description.abstract | Heterogeneous networks provide user equipment (UE) with diverse radio access technologies (RATs). In such a multi-RAT environment, the selection of the access network becomes important. We propose a novel network selection policy (NSP) called the Multi-User-Typed Maximum Reward Policy (MU-MRP). Its goal is to enhance the quality of service (QoS) and to reduce the number of handoffs and dropping events while keeping the monetary cost reasonable. MU-MRP is based on a semi-Markov decision process (SMDP), which yields an optimal policy that maximizes the overall reward. The Q-learning algorithm is used to determine this optimal policy. Numerical results show that MU-MRP earns a larger total system reward than the other policies and, moreover, substantially reduces the number of handoffs and dropping events. | en |
dc.description.provenance | Made available in DSpace on 2021-06-08T03:27:27Z (GMT). No. of bitstreams: 1 ntu-108-R06942036-1.pdf: 8940220 bytes, checksum: 83d101e8d0582d546160600bc303f61e (MD5) Previous issue date: 2019 | en |
dc.description.tableofcontents | Oral Defense Committee Certification . . . i
Acknowledgements . . . ii
Abstract (Chinese) . . . iii
Abstract . . . iv
Contents . . . v
List of Figures . . . vi
List of Tables . . . viii
1 Introduction . . . 1
 1.1 Related Works . . . 2
 1.2 Our Contribution . . . 4
2 Problem Formulation . . . 7
 2.1 System Model . . . 7
 2.2 States, Events and Actions . . . 8
 2.3 Rewards . . . 9
 2.4 Policy Formulation and Optimization . . . 14
3 Performance Evaluation . . . 18
 3.1 Effect of ωC . . . 22
 3.2 Effect of ωch . . . 26
 3.3 Effect of ωcd . . . 30
 3.4 Effect of ωcb . . . 34
4 Conclusions . . . 39
A HetNet Environment Formulation . . . 40
B Notations in Thesis . . . 46
Bibliography . . . 48 | |
dc.language.iso | en | |
dc.title | 一個基於馬可夫決策過程之異質網路接取策略 | zh_TW |
dc.title | An MDP-based Selection Policy for Heterogeneous Network | en |
dc.type | Thesis | |
dc.date.schoolyear | 108-2 | |
dc.description.degree | Master | |
dc.contributor.coadvisor | 林風 | |
dc.contributor.oralexamcommittee | 林一平,廖婉君 | |
dc.subject.keyword | 網路選擇策略,半馬可夫決策過程,多種使用者種類,Q-學習 | zh_TW |
dc.subject.keyword | Network Selection Policy, SMDP, Multi-User Type, Q-learning | en |
dc.relation.page | 50 | |
dc.identifier.doi | 10.6342/NTU201904423 | |
dc.rights.note | Not authorized | |
dc.date.accepted | 2019-12-25 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 電信工程學研究所 | zh_TW |
Appears in Collections: | 電信工程學研究所 |
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-108-1.pdf (currently not authorized for public access) | 8.73 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
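The thesis body is not available in this record, but the mechanism named in the abstract (tabular Q-learning used to find a reward-maximizing network-selection policy) can be illustrated with a minimal, self-contained sketch. Everything below is a hypothetical toy, not the MU-MRP/SMDP formulation from the thesis: the two states, the two access networks (cellular and WLAN), the reward table `R`, and the uniform state transitions are all invented placeholders.

```python
import random

def q_learning(rewards, n_states, n_actions, episodes=2000,
               alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning on a toy network-selection MDP.

    `rewards[s][a]` is the immediate reward for choosing access
    network `a` (here 0 = cellular, 1 = WLAN) in state `s`; the
    next state is drawn uniformly at random (placeholder dynamics,
    standing in for the SMDP transition model).
    """
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    s = 0
    for _ in range(episodes):
        # epsilon-greedy action selection: explore with prob. eps
        if rng.random() < eps:
            a = rng.randrange(n_actions)
        else:
            a = max(range(n_actions), key=lambda x: q[s][x])
        r = rewards[s][a]
        s_next = rng.randrange(n_states)  # placeholder transition
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
        q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])
        s = s_next
    return q

# Toy rewards: in state 0 the WLAN (action 1) pays more,
# in state 1 the cellular network (action 0) pays more.
R = [[1.0, 5.0],
     [4.0, 0.0]]
Q = q_learning(R, n_states=2, n_actions=2)
policy = [max(range(2), key=lambda a: Q[s][a]) for s in range(2)]
print(policy)  # greedy network choice per state
```

With the toy rewards above, the greedy policy extracted from `Q` selects the WLAN in state 0 and the cellular network in state 1, i.e. the higher-reward network in each state, which is the qualitative behavior a reward-maximizing selection policy is meant to exhibit.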