Fairness-Aware Deep Reinforcement Learning for Task Offloading and Resource Allocation in Mobile Edge Computing

張安浩; An-How Chang

Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88997

Title:	以深度強化學習進行公平考量之行動邊緣運算資源分配最佳化 Fairness-Aware Deep Reinforcement Learning for Task Offloading and Resource Allocation in Mobile Edge Computing
Authors:	張安浩 An-How Chang
Advisor:	蔡志宏 Zsehong Tsai
Keyword:	Computation Offloading, Resource Allocation, Decision Algorithm, Reinforcement Learning, Fairness
Publication Year :	2023
Degree:	Master's
Abstract:	As consumer products and user demands advance, users of mobile devices such as phones, tablets, and watches require ever more computation to achieve higher-quality results, and a device's own hardware resources may be insufficient to meet tight completion-time requirements. To overcome the hardware limits of edge devices, mobile edge computing (MEC) has gained increasing attention: nearby edge servers share the computational load of edge devices. This study builds a system with multiple MEC servers, simulates an environment in which multiple users submit computation requests, and proposes a centralized decision and allocation model based on deep reinforcement learning (DRL). The objective is to maximize the number of completed tasks, with a fairness index used to evaluate the allocation decisions, so that the optimization achieves fair allocation. The study further extends the experiments to environments with widely varying task data sizes, showing that the model still makes good and fair allocation decisions, which further validates the reliability of the decision model. In addition, we test several weighted reward functions and observe the model's performance under different settings. In summary, the experiments show that the proposed DRL model can provide fair resource allocation in environments with widely varying data sizes. As the demand for computation grows exponentially in quantity and quality, mobile edge computing (MEC) has received increasing attention as a computing mechanism close to mobile users. With the additional computing power of nearby edge servers, edge users can potentially handle tasks with massive and real-time computation requirements. In this thesis, we consider a multi-MEC system with multiple users, each of which generates computational tasks according to a Poisson process. The thesis proposes a deep reinforcement learning-based model that allocates the tasks gathered from all users through a centralized decision algorithm. We formulate the optimization goal as maximizing the number of tasks finished within their deadlines while taking task fairness into consideration. To handle continuous task flows, we formulate a multi-round allocation scheme and simulate its operation in a complex real-world system. For validation, we run several simulations on networks with large differences in task size to test the model's performance and compare it with random allocation. In addition, we test various types of reward functions, including a basic reward function and a set of weighted reward functions, to observe their performance under different setups. In conclusion, the simulation results show that the DRL model is effective in providing task allocations for various task-size distributions.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88997
DOI:	10.6342/NTU202302089
Fulltext Rights:	Authorized (open access worldwide)
Appears in Collections:	Graduate Institute of Communication Engineering

Files in This Item:

File	Size	Format
ntu-111-2.pdf	1.69 MB	Adobe PDF
