An MDP-based Selection Policy for Heterogeneous Network

Yu-Feng Zheng; 鄭宇峰

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21128

Title:	An MDP-based Selection Policy for Heterogeneous Network
Author:	Yu-Feng Zheng 鄭宇峰
Advisor:	鐘嘉德
Co-advisor:	林風
Keywords:	Network Selection Policy, SMDP, Multi-User Type, Q-learning
Year of publication:	2019
Degree:	Master's
Abstract:	Heterogeneous networks offer user equipment (UE) a variety of radio access technologies (RATs). In an environment where multiple RATs are available, network selection becomes important. We propose a novel network selection policy (NSP) called the Multi-User-Typed Maximum Reward Policy (MU-MRP). Its aim is to enhance quality of service (QoS) and to reduce the number of handoffs and dropping events (failed handoffs) at a reasonable monetary cost. MU-MRP is based on a semi-Markov decision process (SMDP), a model that yields an optimal policy maximizing the overall reward. The Q-learning algorithm is used to determine the optimal policy. Numerical results show that MU-MRP earns a higher total system reward than the other policies compared and substantially reduces the number of handoffs and dropping events.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21128
DOI:	10.6342/NTU201904423
Full-text authorization:	Not authorized
Appears in collections:	Graduate Institute of Communication Engineering
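
The abstract names Q-learning on an SMDP as the way the optimal policy is found, but this record gives no state space, reward function, or parameters. The following is therefore only a minimal sketch of a generic SMDP Q-learning update, not the MU-MRP model from the thesis. The two networks (`wifi`, `cellular`), the values in `SESSION_REWARD`, `HANDOFF_PENALTY`, the exponential sojourn times, and the learning parameters are all hypothetical and chosen purely for illustration.

```python
import math
import random

ACTIONS = ("wifi", "cellular")  # hypothetical radio access networks
# Hypothetical net reward per session (QoS benefit minus monetary cost).
SESSION_REWARD = {"wifi": 1.0, "cellular": 0.6}
HANDOFF_PENALTY = 0.8  # hypothetical cost charged when the network changes


def smdp_q_update(q, state, action, reward, sojourn, next_state,
                  alpha=0.1, beta=0.5):
    """Apply one SMDP Q-learning step and return the updated Q-value.

    Unlike the plain MDP update, the discount depends on the random
    time spent in the state: exp(-beta * sojourn).
    """
    discount = math.exp(-beta * sojourn)
    best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + discount * best_next - old)
    return q[(state, action)]


def toy_step(state, action, rng):
    """Toy environment: the state is the currently attached network."""
    switched = action != state
    reward = SESSION_REWARD[action] - (HANDOFF_PENALTY if switched else 0.0)
    sojourn = rng.expovariate(1.0)  # assumed exponential session length
    return reward, sojourn, action


def train(steps=5000, epsilon=0.1, seed=0):
    """Learn Q-values with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {}
    state = "cellular"
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
        reward, sojourn, next_state = toy_step(state, action, rng)
        smdp_q_update(q, state, action, reward, sojourn, next_state)
        state = next_state
    return q


if __name__ == "__main__":
    q_table = train()
    for key in sorted(q_table):
        print(key, round(q_table[key], 3))
```

In this toy setting the handoff penalty makes staying on the better network more valuable than switching away from it, which is the kind of trade-off between reward and handoff count that the abstract describes.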

Files in this item:

File	Size	Format
ntu-108-1.pdf (not currently authorized for public access)	8.73 MB	Adobe PDF


Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.
