Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/51422

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 廖婉君 | |
| dc.contributor.author | Linjiun Tsai | en |
| dc.contributor.author | 蔡林峻 | zh_TW |
| dc.date.accessioned | 2021-06-15T13:33:40Z | - |
| dc.date.available | 2021-02-01 | |
| dc.date.copyright | 2016-03-08 | |
| dc.date.issued | 2016 | |
| dc.date.submitted | 2016-02-01 | |
| dc.identifier.citation | [1] Apache Hadoop. Available: http://wiki.apache.org/hadoop [2] Apache Spark. Available: http://spark.apache.org/ [3] P. Barham et al., “Xen and the art of virtualization,” ACM SIGOPS Operating Systems Review, vol. 37, no. 5, pp. 164-177, 2003. [4] C. Clark et al., “Live migration of virtual machines,” in Proc. of the 2nd Conference on Symposium on Networked Systems Design & Implementation - Volume 2, 2005. [5] V. V. Vazirani, Approximation Algorithms, Springer Science & Business Media, 2002. [6] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, WH Freeman & Co., San Francisco, 1979. [7] G. Dósa, “The tight bound of first fit decreasing bin-packing algorithm is FFD(I)≤(11/9)OPT(I)+6/9,” in Combinatorics, Algorithms, Probabilistic and Experimental Methodologies, Springer Berlin Heidelberg, 2007. [8] B. Xia and Z. Tan, “Tighter bounds of the first fit algorithm for the bin-packing problem,” Discrete Applied Mathematics, vol. 158, no. 15, pp. 1668-1675, 2010. [9] Q. He et al., “Case study for running HPC applications in public clouds,” in Proc. of the 19th ACM International Symposium on High Performance Distributed Computing, 2010. [10] S. Kandula et al., “The nature of data center traffic: measurements & analysis,” in Proc. of the 9th ACM SIGCOMM Conference on Internet Measurement Conference, 2009. [11] T. Ristenpart et al., “Hey, you, get off of my cloud: exploring information leakage in third-party compute clouds,” in Proc. of the 16th ACM Conference on Computer and Communications Security, 2009. [12] C. F. Lai et al., “A network and device aware QoS approach for cloud-based mobile streaming,” IEEE Trans. on Multimedia, vol. 15, no. 4, pp. 747-757, 2013. [13] X. Wang et al., “Cloud-assisted adaptive video streaming and social-aware video prefetching for mobile users,” IEEE Wireless Communications, vol. 20, no. 3, pp. 72-79, 2013. [14] R. Shea et al., “Cloud gaming: architecture and performance,” IEEE Network Magazine, vol. 27, no. 4, pp. 16-21, 2013. [15] S. K. Barker and P. Shenoy, “Empirical evaluation of latency-sensitive application performance in the cloud,” in Proc. of the First Annual ACM Multimedia Systems Conference, 2010. [16] J. Ekanayake et al., “MapReduce for data intensive scientific analyses,” in IEEE Fourth International Conference on eScience, 2008. [17] A. Iosup et al., “Performance analysis of cloud computing services for many-tasks scientific computing,” IEEE Trans. on Parallel and Distributed Systems, vol. 22, no. 6, pp. 931-945, 2011. [18] M. Zaharia et al., “Spark: cluster computing with working sets,” in Proc. of the 2nd USENIX Conference on Hot Topics in Cloud Computing, 2010. [19] L. Breiman, “Random forests,” Machine Learning, vol. 45, no. 1, pp. 5-32, 2001. [20] F. Mao et al., “Influence of program inputs on the selection of garbage collectors,” in Proc. of the 2009 ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, 2009. [21] M. R. Hines et al., “Applications know best: performance-driven memory overcommit with Ginkgo,” in IEEE Third International Conference on Cloud Computing Technology and Science, 2011. [22] J. S. Jeong et al., “Elastic memory: bring elasticity back to in-memory big data analytics,” in 15th Workshop on Hot Topics in Operating Systems, 2015. [23] S. Spinner et al., “Proactive memory scaling of virtualized applications,” in IEEE 8th International Conference on Cloud Computing, 2015. [24] G. Shanmuganathan et al., “Towards proactive resource management in virtualized datacenters,” in Runtime Environments, Systems, Layering and Virtualized Environments, 2013. [25] C. O. Chen et al., “Machine learning-based configuration parameter tuning on Hadoop system,” in IEEE International Congress on Big Data, 2015. [26] M. Hertz and E. D. Berger, “Quantifying the performance of garbage collection vs. explicit memory management,” in Proc. of the 20th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, vol. 40, no. 10, pp. 313-326, 2005. [27] T. Jiang et al., “Understanding the behavior of in-memory computing workloads,” in IEEE International Symposium on Workload Characterization, 2014. [28] L. Gu and H. Li, “Memory or time: performance evaluation for iterative operation on Hadoop and Spark,” in IEEE 10th International Conference on High Performance Computing and Communications and IEEE International Conference on Embedded and Ubiquitous Computing, 2013. [29] J. Singer et al., “The economics of garbage collection,” ACM SIGPLAN Notices - ISMM '10, vol. 45, no. 8, pp. 103-112, 2010. [30] S. Lee and S. Sahu, “Efficient server consolidation considering intra-cluster traffic,” in IEEE Global Telecommunications Conference, 2011. [31] M. Wang et al., “Consolidating virtual machines with dynamic bandwidth demand in data centers,” in Proc. IEEE INFOCOM, 2011. [32] V. Mann et al., “VMFlow: leveraging VM mobility to reduce network power costs in data centers,” in Proc. of the 10th International IFIP TC 6 Conference on Networking - Volume Part I, 2011. [33] Y. Ho et al., “Server consolidation algorithms with bounded migration cost and performance guarantees in cloud computing,” in Fourth IEEE International Conference on Utility and Cloud Computing, 2011. [34] A. Murtazaev and S. Oh, “Sercon: server consolidation algorithm using live migration of virtual machines for green computing,” IETE Technical Review, vol. 28, no. 3, pp. 212-231, 2011. [35] S. Akoush et al., “Predicting the performance of virtual machine migration,” in IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems, 2010. [36] L. Yang et al., “Real-time tasks oriented energy-aware scheduling in virtualized clouds,” IEEE Trans. on Cloud Computing, vol. 2, no. 2, pp. 168-180, 2014. [37] A. Shieh et al., “Sharing the data center network,” in Proc. of the 8th USENIX Conference on Networked Systems Design and Implementation, 2011. [38] H. Rodrigues et al., “Gatekeeper: supporting bandwidth guarantees for multi-tenant datacenter networks,” in Proc. of the 3rd Conference on I/O Virtualization, 2011. [39] H. Ballani et al., “Towards predictable datacenter networks,” ACM SIGCOMM Computer Communication Review, vol. 41, no. 4, pp. 242-253, 2011. [40] D. Xie et al., “The only constant is change: incorporating time-varying network reservations in data centers,” ACM SIGCOMM Computer Communication Review, vol. 42, no. 4, pp. 199-210, 2012. [41] L. Popa et al., “ElasticSwitch: practical work-conserving bandwidth guarantees for cloud computing,” ACM SIGCOMM Computer Communication Review, vol. 43, no. 4, pp. 351-362, 2013. [42] H. Ballani et al., “Chatty tenants and the cloud network sharing problem,” in Proc. of the 10th USENIX Conference on Networked Systems Design and Implementation, 2013. [43] C. Guo et al., “SecondNet: a data center network virtualization architecture with bandwidth guarantees,” in Proc. of the 6th International Conference, 2010. [44] J. W. Jiang et al., “Joint VM placement and routing for data center traffic engineering,” in Proc. IEEE INFOCOM, 2012. [45] D. M. Divakaran et al., “An online integrated resource allocator for guaranteed performance in data centers,” IEEE Trans. on Parallel and Distributed Systems, vol. 25, no. 6, pp. 1382-1392, 2014. [46] L. Tsai and W. Liao, “Cost-aware workload consolidation in green cloud datacenter,” in IEEE 1st International Conference on Cloud Networking, 2012. [47] L. Tsai and W. Liao, “StarCube: an on-demand and cost-effective framework for cloud data center networks with performance guarantee,” IEEE Trans. on Cloud Computing, doi:10.1109/TCC.2015.2464818. [48] R. Available: http://www.r-project.org/ [49] LIBSVM Datasets. Available: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/ [50] Stanford Large Network Dataset Collection. Available: https://snap.stanford.edu/data/ [51] M. Al-Fares et al., “A scalable, commodity data center network architecture,” ACM SIGCOMM Computer Communication Review, vol. 38, no. 4, pp. 63-74, 2008. [52] L. A. Wolsey and G. L. Nemhauser, Integer and Combinatorial Optimization, John Wiley & Sons, New York, 1988. [53] J. E. Hopcroft and R. M. Karp, “An n^5/2 algorithm for maximum matchings in bipartite graphs,” SIAM Journal on Computing, vol. 2, no. 4, pp. 225-231, 1973. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/51422 | - |
| dc.description.abstract | The emerging Big Data paradigm has attracted attention from a wide variety of industry sectors, including healthcare, finance, retail, and manufacturing. To process massive heterogeneous data in a near real-time manner, Big Data applications should be run on dedicated server clusters that aggregate huge computing power, memory, and storage through fast, unimpeded, and reliable network infrastructures. Implementing such high-performance cluster computing is typically not economical for companies that have only occasional demand for Big Data processing. Cloud computing is considered a viable solution for reducing the operating costs of Big Data applications due to its on-demand, pay-per-use, and scalable nature. The shared nature of cloud data centers, however, may make application performance unpredictable. The strict network requirements and extremely large memory demands of Big Data clusters also make it difficult to optimize the allocation of cloud resources. These difficulties translate into a higher hosting cost per application. This dissertation proposes a solution to these problems that allows more concurrent Big Data applications to be deployed in cloud data centers in the most resource-efficient way while meeting their real-time requirements. To this end, we present 1) the first resource allocation framework that guarantees network performance for each Big Data cluster in multi-tenant clouds, 2) the first machine learning model that predicts the most efficient memory size for each Big Data cluster according to given upper bounds on performance penalties, and 3) an adaptive resource consolidation mechanism that strikes a balance between the number of required servers and the overhead of dynamic server consolidation for each cluster. The resource allocation framework takes advantage of the symmetry of the fat-tree network structure to enable data center networks to be efficiently partitioned into mutually exclusive and collectively exhaustive star networks, each allocated to a Big Data cluster. It provides several promising properties: 1) every cluster is isolated from the others; 2) the topology of every cluster is non-blocking for arbitrary traffic patterns; 3) each cluster is formed with the minimum number of links; 4) the per-hop distance between any two servers in a cluster is equal; 5) the network topology allocated to each cluster is guaranteed to remain logically unchanged during and after reallocation; 6) for fault-tolerant allocation, the number of backup links connecting backup and active servers is the minimum; 7) data center networks can be elastically trimmed and expanded while maintaining all the properties above. Based on these properties, a cost-bounded resource reallocation mechanism is also proposed, making nearly full use of cloud resources in polynomial time. The model for predicting the optimal memory size is designed to capture the memory management behavior of Java virtual machines as well as the dynamic changes in memory consumption on distributed compute nodes. In experiments on a physical Spark cluster with 128 cores and 1 TB of memory, the model shows good prediction accuracy and saves a significant amount of memory space when operating Big Data applications that demand up to hundreds of gigabytes of working memory. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-15T13:33:40Z (GMT). No. of bitstreams: 1 ntu-105-D97921014-1.pdf: 2435601 bytes, checksum: a78f49bab7f8c2d8eb5e86c20cde899c (MD5) Previous issue date: 2016 | en |
| dc.description.tableofcontents | Chapter 1 Introduction 1 1.1 Big Data and Cloud Data Centers 1 1.2 Impact of Memory Size on Performance 2 1.3 Optimized Spot for Memory Allocation 8 1.4 Server Virtualization 11 1.5 Server Consolidation 12 1.6 Scheduling of Virtual Machine Reallocation 13 1.7 Intra-Application Communication 14 1.8 Related Works 17 Chapter 2 Learning-Based Memory Allocation Optimization 23 2.1 Collecting the Training Data 23 2.2 Normalizing the Training Data 24 2.3 Defining the Learning Features 26 2.4 Building the Learning Model 27 2.5 Experimental Setup 28 2.6 Accuracy of Predictions 30 Chapter 3 Allocation of Virtual Machines 33 3.1 Problem Formulation 33 3.2 Adaptive Fit Algorithm 35 3.3 Time Complexity of Adaptive Fit 39 3.4 Simulation Setup 39 3.5 Cost of Server Consolidation 41 3.6 Effectiveness of Server Consolidation 42 3.7 Saved Cost of Server Consolidation 42 Chapter 4 Allocation of Data Center Networks 45 4.1 Labeling the Network Links 46 4.2 Grouping the Network Links 48 4.3 Formatting Star Networks 50 4.4 Matrix Representation 53 4.5 Building Variants of Fat-tree Networks 57 4.6 Fault-Tolerant Resource Allocation 58 4.7 Fundamental Reallocation and Properties 60 4.8 Traffic Redirection and Server Migration 63 Chapter 5 Allocation of Servers 65 5.1 Problem Formulation 65 5.2 Multi-Step Reallocation 70 5.3 Generality of the Reallocation Mechanisms 72 5.4 Algorithms for Allocation and Reallocation 73 5.4.1 Listing All Reallocation (LAR) 73 5.4.2 Single-Pod Reallocation (SPR) 74 5.4.3 Multi-Pod Reallocation (MPR) 75 5.4.4 StarCube Allocation Procedure (SCAP) 76 5.4.5 Properties of the Algorithms 77 5.5 Simulation Setup 81 5.6 Efficiency of Server Allocation 83 5.7 Impact of the Size of Partitions 85 5.8 Cost of Reallocating Partitions 86 Chapter 6 Conclusions 89 Bibliography 92 Appendix 96 | |
| dc.language.iso | en | |
| dc.subject | 網路最佳化 | zh_TW |
| dc.subject | 雲端運算 | zh_TW |
| dc.subject | 巨量資料 | zh_TW |
| dc.subject | 資源最佳化 | zh_TW |
| dc.subject | 記憶體管理 | zh_TW |
| dc.subject | 效能成本權衡 | zh_TW |
| dc.subject | 效能保證 | zh_TW |
| dc.subject | Performance Guarantee | en |
| dc.subject | Performance-Cost Trade-off | en |
| dc.subject | Network Optimization | en |
| dc.subject | Cloud Computing | en |
| dc.subject | Big Data | en |
| dc.subject | Resource Optimization | en |
| dc.subject | Memory Management | en |
| dc.title | 一個具有成本效益的即時巨量資料處理系統 | zh_TW |
| dc.title | A Cost-Effective System for Real-Time Big Data Processing | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 104-1 | |
| dc.description.degree | 博士 | |
| dc.contributor.oralexamcommittee | 林宗男,周承復,吳曉光,林俊宏,蔡子傑 | |
| dc.subject.keyword | 雲端運算,巨量資料,資源最佳化,記憶體管理,效能成本權衡,效能保證,網路最佳化 | zh_TW |
| dc.subject.keyword | Cloud Computing, Big Data, Resource Optimization, Memory Management, Performance-Cost Trade-off, Performance Guarantee, Network Optimization | en |
| dc.relation.page | 100 | |
| dc.rights.note | 有償授權 | |
| dc.date.accepted | 2016-02-01 | |
| dc.contributor.author-college | 電機資訊學院 | zh_TW |
| dc.contributor.author-dept | 電機工程學研究所 | zh_TW |
| Appears in Collections: | Department of Electrical Engineering | |
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-105-1.pdf (Restricted Access) | 2.38 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
