Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98224

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 林忠緯 | zh_TW |
| dc.contributor.advisor | Chung-Wei Lin | en |
| dc.contributor.author | 劉威潔 | zh_TW |
| dc.contributor.author | Wei-Chieh Liu | en |
| dc.date.accessioned | 2025-07-30T16:24:08Z | - |
| dc.date.available | 2025-07-31 | - |
| dc.date.copyright | 2025-07-30 | - |
| dc.date.issued | 2025 | - |
| dc.date.submitted | 2025-07-25 | - |
| dc.identifier.citation | [1] D. Bernstein, “Containers and cloud: From LXC to Docker to Kubernetes,” IEEE Cloud Computing, vol. 1, no. 3, pp. 81–84, 2014. [2] B. Burns, J. Beda, K. Hightower, and L. Evenson, Kubernetes: Up and Running: Dive into the Future of Infrastructure. O’Reilly Media, Inc., 2022. [3] B. Burns, B. Grant, D. Oppenheimer, E. Brewer, and J. Wilkes, “Borg, Omega, and Kubernetes,” Communications of the ACM, vol. 59, no. 5, pp. 50–57, 2016. [4] Z. Cai and R. Buyya, “Inverse queuing model-based feedback control for elastic container provisioning of web systems in Kubernetes,” IEEE Transactions on Computers, vol. 71, no. 2, pp. 337–348, 2021. [5] A. Chung, J. W. Park, and G. R. Ganger, “Stratus: Cost-aware container scheduling in the public cloud,” in Proceedings of the ACM Symposium on Cloud Computing, pp. 121–134, 2018. [6] J. Hartmanis, “Computers and intractability: A guide to the theory of NP-completeness (Michael R. Garey and David S. Johnson),” SIAM Review, vol. 24, no. 1, p. 90, 1982. [7] T. Li, L. Qiu, F. Chen, H. Chen, and N. Zhou, “Carokrs: Cost-aware resource optimization Kubernetes resource scheduler,” in 2024 9th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), pp. 127–133, IEEE, 2024. [8] M. Lin, J. Xi, W. Bai, and J. Wu, “Ant colony algorithm for multi-objective optimization of container-based microservice scheduling in cloud,” IEEE Access, vol. 7, pp. 83088–83100, 2019. [9] W. Lin, D. Qi et al., “Review of cloud computing resource scheduling,” Comput. Sci., vol. 39, no. 10, pp. 1–6, 2012. [10] B. Liu, J. Li, W. Lin, W. Bai, P. Li, and Q. Gao, “K-PSO: An improved PSO-based container scheduling algorithm for big data applications,” International Journal of Network Management, vol. 31, no. 2, p. e2092, 2021. [11] D. Merkel et al., “Docker: Lightweight Linux containers for consistent development and deployment,” Linux J., vol. 239, no. 2, p. 2, 2014. [12] J. Santos, T. Wauters, B. Volckaert, and F. De Turck, “Towards network-aware resource provisioning in Kubernetes for fog computing applications,” in 2019 IEEE Conference on Network Softwarization (NetSoft), pp. 351–359, IEEE, 2019. [13] K. Senjab, S. Abbas, N. Ahmed, and A. u. R. Khan, “A survey of Kubernetes scheduling algorithms,” Journal of Cloud Computing, vol. 12, no. 1, p. 87, 2023. [14] T. Wang, S. Ferlin, and M. Chiesa, “Predicting CPU usage for proactive autoscaling,” in Proceedings of the 1st Workshop on Machine Learning and Systems, pp. 31–38, 2021. [15] L. Wojciechowski, K. Opasiak, J. Latusek, M. Wereski, V. Morales, T. Kim, and M. Hong, “NetMarks: Network metrics-aware Kubernetes scheduler powered by service mesh,” in IEEE INFOCOM 2021 - IEEE Conference on Computer Communications, pp. 1–9, IEEE, 2021. [16] Q. Zhang, L. Cheng, and R. Boutaba, “Cloud computing: State-of-the-art and research challenges,” Journal of Internet Services and Applications, vol. 1, pp. 7–18, 2010. | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98224 | - |
| dc.description.abstract | 隨著雲端運算與微服務架構的普及,Kubernetes 已成為主流的容器編排平台。然而其預設調度策略未能充分考量節點硬體世代差異與工作負載特性,導致生產環境中常見資源配置效率低落與基礎設施成本偏高等問題。為因應此一挑戰,本文以某企業部署於 AWS EKS 的一年運行快照為基礎,設計一組具實務代表性的模擬環境,並提出兩階段調度策略:「Guaranteed Spread & New-Node Priority Scheduling(GSNPS)」與「CPU-Resource Aligned Scheduling(CRAS)」。GSNPS 於初始 Pod 排程階段納入節點硬體世代與區域分佈考量,以提升部署的更新效益與容錯能力;CRAS 則於後處理階段應用模擬退火演算法(Simulated Annealing, SA),將 CPU 密集型工作負載重新配置至計算優化節點,以提升資源對位效率。根據 24 小時模擬結果,本策略能有效促進老舊節點的退役,並提升整體資源使用的成本效益,顯示其在異質化 Kubernetes 叢集下具備實用潛力。 | zh_TW |
| dc.description.abstract | With the rise of cloud computing and microservice architectures, Kubernetes has become the standard platform for container orchestration. However, its default scheduler often overlooks factors such as hardware generation and workload-specific resource profiles, leading to suboptimal resource utilization and elevated infrastructure costs in long-running production clusters. This thesis presents a case study based on an anonymized one-year snapshot from an enterprise-scale AWS Elastic Kubernetes Service (EKS) deployment. Two scheduling strategies are proposed: Guaranteed Spread and New-Node Priority Scheduling (GSNPS), which prioritizes zone-level fault tolerance and favors newer, cost-efficient instances during pod placement; and CPU-Resource Aligned Scheduling (CRAS), a post-processing refinement that applies Simulated Annealing (SA) to reallocate CPU-intensive workloads to compute-optimized nodes. A 24-hour simulation demonstrates that the combined GSNPS and CRAS strategy significantly improves infrastructure efficiency. GSNPS facilitates the retirement of underutilized legacy nodes, while CRAS enhances resource alignment by matching workload profiles to node specializations. The results validate the effectiveness of this two-stage scheduling framework in promoting cost-efficient node usage and sustainable cluster operation in heterogeneous Kubernetes environments. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-07-30T16:24:08Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2025-07-30T16:24:08Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Acknowledgements ii 摘要 iii Abstract iv Table of Contents vi List of Figures ix Chapter 1. Introduction 1 1.1 Background and Context 1 1.2 Related Work 4 1.3 Motivation 5 1.4 Objectives and Scope of the Study 7 1.5 Contributions 8 1.6 Thesis Organisation 9 Chapter 2. Problem Formulation 10 2.1 Overview 10 2.2 Assumptions 11 2.2.1 Notation and Definitions 11 2.2.2 Cluster Model 17 2.2.3 Workload Model 18 2.3 Abstraction 20 2.3.1 Deployment Graph and Priority Edges 21 2.3.2 Decision Variables 21 2.3.3 Derived Quantities 21 2.3.4 Objective Functions 22 2.3.5 Constraints 24 2.4 Complexity Analysis 26 Chapter 3. Algorithms 28 3.1 Overview 28 3.2 Guaranteed Spread and New-Node Priority Scheduling (GSNPS) 28 3.2.1 Scoring Function Definition 29 3.2.2 Node Selection Strategy 30 3.2.3 Implementation and Complexity 30 3.3 CPU-Resource Aligned Scheduling (CRAS) 30 3.3.1 Hardware Class and Workload Characterization 31 3.3.2 Refinement via SA 32 3.3.3 Benefits of Alignment 33 Chapter 4. Experiments 35 4.1 Overview 35 4.2 Baseline Results (Default Kubernetes Scheduler) 37 4.2.1 Scoring Mechanism 37 4.2.2 Observed Scheduling Outcomes 38 4.2.3 Deployment Redundancy and Spread 38 4.2.4 Node Retirement and Cost Baseline 39 4.2.5 Cross-Zone Communication Overhead 39 4.3 Results and Comparative Analysis (GSNPS vs. Default) 40 4.3.1 Zone-Level Replica Distribution 40 4.3.2 Node Generation Preference 41 4.3.3 Resource Utilization Over Time 41 4.3.4 Facilitation of Node Retirement 42 4.4 Impact of CRAS Optimization 43 4.4.1 Runtime Overhead of CRAS 43 4.4.2 Reduction in Alignment Penalty 44 4.4.3 Improvement in Resource Utilization 44 4.4.4 Enhancement of Cost-Efficient Resource Utilization 45 4.4.5 Summary of Optimization Effects 46 4.5 Summary of Experimental Findings 47 Chapter 5. Discussion 56 5.1 Overview 56 5.2 Interpretation of Key Findings 56 5.3 Trade-offs and Systemic Impact 57 5.4 Scalability and Deployment Considerations 57 5.5 Limitations 58 5.6 Practical Implications 59 5.7 Future Work 59 5.8 Summary 60 Chapter 6. Conclusion 61 Bibliography 63 Appendix 66 Appendix A. Full Deployment Resource Configuration 67 Appendix B. Deployment Rollout Summary 71 Appendix C. Deployment Rollout Days 75 | - |
| dc.language.iso | en | - |
| dc.subject | Kubernetes 調度 | zh_TW |
| dc.subject | 節點汰除 | zh_TW |
| dc.subject | 模擬退火(SA) | zh_TW |
| dc.subject | 微服務部署 | zh_TW |
| dc.subject | 資源對位 | zh_TW |
| dc.subject | 雲端資源最佳化 | zh_TW |
| dc.subject | 容器編排 | zh_TW |
| dc.subject | node retirement | en |
| dc.subject | container orchestration | en |
| dc.subject | cloud cost optimization | en |
| dc.subject | resource alignment | en |
| dc.subject | workload placement | en |
| dc.subject | simulated annealing | en |
| dc.subject | Kubernetes scheduling | en |
| dc.title | Kubernetes 環境中節點退役流程對容器資源配置與排程策略之影響分析 | zh_TW |
| dc.title | A Case Study on Node Retirement Procedures for Pod Allocation and Scheduling in Kubernetes | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 113-2 | - |
| dc.description.degree | 碩士 | - |
| dc.contributor.oralexamcommittee | 江蕙如;黎士瑋;黃上恩 | zh_TW |
| dc.contributor.oralexamcommittee | Hui-Ru Jiang;Shih-Wei Li;Shang-En Huang | en |
| dc.subject.keyword | Kubernetes 調度,節點汰除,模擬退火(SA),微服務部署,資源對位,雲端資源最佳化,容器編排 | zh_TW |
| dc.subject.keyword | Kubernetes scheduling,node retirement,simulated annealing,workload placement,resource alignment,cloud cost optimization,container orchestration | en |
| dc.relation.page | 79 | - |
| dc.identifier.doi | 10.6342/NTU202502450 | - |
| dc.rights.note | 未授權 | - |
| dc.date.accepted | 2025-07-29 | - |
| dc.contributor.author-college | 電機資訊學院 | - |
| dc.contributor.author-dept | 資訊工程學系 | - |
| dc.date.embargo-lift | N/A | - |
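
The abstract above describes a two-stage strategy: GSNPS scores candidate nodes at initial pod placement, favoring newer hardware generations and zone-level spread, and CRAS then applies simulated annealing to migrate CPU-intensive pods onto compute-optimized nodes. As a rough illustration of that idea only, the minimal Python sketch below uses hypothetical node and pod attributes (`generation`, `compute_optimized`, `cpu_intensive`), a simple additive scoring rule, and a geometric cooling schedule; none of these details are taken from the thesis.

```python
# Illustrative sketch only (not code from the thesis): the attribute names,
# the additive scoring rule, and the cooling schedule are all assumptions.
import math
import random
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    zone: str
    generation: int          # higher value = newer hardware generation (assumed encoding)
    compute_optimized: bool  # e.g. a compute-optimized ("c"-family) EC2 instance
    cpu_capacity: float      # vCPUs available on the node
    cpu_used: float = 0.0    # vCPUs already requested by placed pods


@dataclass
class Pod:
    name: str
    cpu_request: float
    cpu_intensive: bool      # workload profile label, assumed known in advance
    node: Node | None = None


def gsnps_score(node: Node, pod: Pod, zone_replicas: dict[str, int]) -> float:
    """GSNPS-style score: prefer newer generations and zones that currently
    hold fewer replicas of the same deployment (guaranteed spread)."""
    if node.cpu_used + pod.cpu_request > node.cpu_capacity:
        return float("-inf")                     # infeasible placement
    return node.generation - zone_replicas.get(node.zone, 0)


def place(pod: Pod, nodes: list[Node], zone_replicas: dict[str, int]) -> None:
    """Initial placement: bind the pod to the highest-scoring feasible node
    (the sketch assumes at least one node has spare capacity)."""
    best = max(nodes, key=lambda n: gsnps_score(n, pod, zone_replicas))
    pod.node = best
    best.cpu_used += pod.cpu_request
    zone_replicas[best.zone] = zone_replicas.get(best.zone, 0) + 1


def alignment_penalty(pods: list[Pod]) -> float:
    """CRAS objective (assumed form): number of CPU-intensive pods that are
    not running on compute-optimized nodes."""
    return sum(1.0 for p in pods if p.cpu_intensive and not p.node.compute_optimized)


def cras_refine(pods: list[Pod], nodes: list[Node], steps: int = 500) -> None:
    """Post-processing via simulated annealing: try random pod moves and
    accept worse assignments with a probability that decays as the
    temperature cools, escaping local minima of the alignment penalty."""
    temperature = 1.0
    for _ in range(steps):
        pod = random.choice(pods)
        target = random.choice(nodes)
        if target is pod.node or target.cpu_used + pod.cpu_request > target.cpu_capacity:
            continue
        source = pod.node
        before = alignment_penalty(pods)
        source.cpu_used -= pod.cpu_request       # tentatively move the pod
        target.cpu_used += pod.cpu_request
        pod.node = target
        delta = alignment_penalty(pods) - before
        if delta > 0 and random.random() >= math.exp(-delta / temperature):
            pod.node = source                    # reject the move: undo it
            source.cpu_used += pod.cpu_request
            target.cpu_used -= pod.cpu_request
        temperature *= 0.99                      # geometric cooling schedule
```

A toy run would construct a few `Node` and `Pod` objects, call `place()` for each pod in arrival order, and then invoke `cras_refine()` once over the resulting assignment; the thesis itself evaluates the real strategies on a 24-hour simulation of an EKS snapshot rather than on a toy model like this.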
Appears in Collections: 資訊工程學系
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-113-2.pdf (restricted access, not publicly available) | 1.57 MB | Adobe PDF |
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.