利用異質計算平行化加速相依性結構矩陣遺傳演算法二代

Sung-Chi Li; 李松錡

請用此 Handle URI 來引用此文件： http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/68898

完整後設資料紀錄

DC 欄位	值	語言
dc.contributor.advisor	于天立(Tian-Li Yu)
dc.contributor.author	Sung-Chi Li	en
dc.contributor.author	李松錡	zh_TW
dc.date.accessioned	2021-06-17T02:40:59Z	-
dc.date.available	2017-08-24
dc.date.copyright	2017-08-24
dc.date.issued	2017
dc.date.submitted	2017-08-16
dc.identifier.citation	[1] G. M. Amdahl. Validity of the single processor approach to achieving large scale computing capabilities. Proceedings of the Spring Joint Computer Conference, pages 483–485, 1967. [2] P. A. Bosman and D. Thierens. Linkage neighbors, optimal mixing and forced im- provements in genetic algorithms. Proceedings of the Genetic and Evolutionary Computation Conference, pages 585–592, 2012. [3] R. de Bokx. Parallelizing the linkage tree genetic algorithm and searching for the optimal replacement for the linkage tree. Master thesis, Delft University of Technol- ogy, 2015. [4] T. S. Duque, D. E. Goldberg, and K. Sastry. Improving the efficiency of the ex- tended compact genetic algorithm. Proceedings of the Genetic and Evolutionary Computation Conference, pages 467–468, 2008. [5] R. A. Fisher and F. Yates. Statistical tables for biological, agricultural and medical research. Oliver & Boyd, London, 1938. [6] S.-H. Hsu and T.-L. Yu. Optimization by pairwise linkage detection, incremental linkage set, and restricted/back mixing: DSMGA-II. Proceedings of the Genetic and Evolutionary Computation Conference, pages 519–526, 2015. [7] P. Krömer, J. Platoš, V. Snášel, and A. Abraham. Many-threaded differential evolu- tion on the GPU. Proceedings of the Massively Parallel Evolutionary Computation on GPGPUs, pages 121–147, 2013. [8] S. Kullback and R. A. Leibler. On information and sufficiency. The Annals of Math- ematical Statistics, 22(1):79–86, 1951. [9] J.-M. Li, X.-J. Wang, R.-S. He, and Z.-X. Chi. An efficient fine-grained parallel genetic algorithm based on GPU-accelerated. International Federation for Infor- mation Processing International Conference on Network and Parallel Computing Workshops, pages 855–862, 2007. [10] L. Mussi, F. Daolio, and S. Cagnoni. Evaluation of parallel particle swarm optimiza- tion algorithms within the CUDA architecture. Information Sciences, 181(20):4642– 4657, 2011. [11] J. Ocenasek and M. Pelikan. Parallel mixed bayesian optimization algorithm: A scaleup analysis. Workshop Proceedings of the Genetic and Evolutionary Computa- tion Conference, 2004. [12] C. Patel. Different optimization strategies and performance evaluation of reduction on multicore CUDA architecture. International Journal of Engineering, 3(4):1567– 1570, 2014. [13] M. Pelikan and D. E. Goldberg. Hierarchical boa solves ising spin glasses and maxsat. Proceedings of the Genetic and Evolutionary Computation Conference, pages 1271–1282, 2003. [14] P. Pospichal, J. Jaros, and J. Schwarz. Parallel genetic algorithm on the CUDA architecture. Proceedings of the European Conference on the Applications of Evo- lutionary Computation, pages 442–451, 2010. [15] S.M.Poulding,J.P.Staunton,andN.J.Burles.Fullimplementationofanestimation of distribution algorithm on a GPU. Proceedings of the Genetic and Evolutionary Computation Conference 2011, GPUs for Genetic and Evolutionary Computation Competition, 2011. [16] J.-H.Seo,E.-S.Ko,andY.-H.Kim.PerformancecomparisonofGPUswithagenetic algorithm based on CUDA. Advanced Science and Technology Letters, 65:36–40, 2014. [17] C.-Y.ShaoandT.-L.Yu.SpeedingupmodelbuildingforECGAonCUDAplatform. Proceedings of the Genetic and Evolutionary Computation Conference, pages 1197– 1204, 2013. [18] D. L. Souza, G. D. Monteiro, T. C. Martins, V. A. Dmitriev, and O. N. Teixeira. Pso- gpu: accelerating particle swarm optimization in CUDA-based graphics processing units. Proceedings of the Genetic and Evolutionary Computation Conference, pages 837–838, 2011. [19] D. Thierens. The linkage tree genetic algorithm. Proceedings of the International Conference on Parallel Problem Solving from Nature, pages 264–273, 2010. [20] D. Thierens and P. A. Bosman. Optimal mixing evolutionary algorithms. Pro- ceedings of the Genetic and Evolutionary Computation Conference, pages 617–624, 2011. [21] T.-L. Yu, D. E. Goldberg, A. Yassine, and Y.-P. Chen. Genetic algorithm design in- spired by organizational theory: Pilot study of a dependency structure matrix driven genetic algorithm. Proceedings of the Genetic and Evolutionary Computation Con- ference, pages 1620–1621, 2003. [22] T.-L. Yu, K. Sastry, D. E. Goldberg, and M. Pelikan. Population sizing for entropy- based model building in discrete estimation of distribution algorithms. Proceedings of the Genetic and Evolutionary Computation Conference, pages 601–608, 2007. [23] A. L. Zobrist. A new hashing method with application for game playing. Interna- tional Congress and Convention Association journal, 13(2):69–73, 1970.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/68898	-
dc.description.abstract	相依式結構矩陣遺傳演算法二代在許多指標問題上有著良好的效能，此演算法透過偵測問題結構來建構模型，並透過兩個族群混合運算子來尋求更好的解。建構模型雖是造成其效能良好的原因之一，但這個過程卻十分耗時；此外族群混合運算子在計算混合後的適應分數亦需要不少時間。本篇論文透過平行化的方式來加速這兩個耗時的運算。在建構模型上的加速使用圖形處理器，首先提出演算法概念上相同的加速版本，而後再提出利用修改演算法降低模型準確度，來獲得更高加速的第二個版本。在混合運算子的平行化上考慮到問題特性，故透過中央處理器進行加速。模型建構的及混合運算子的加速在我們使用的幾個常見的測試問題中，都展現出穩定的加速成果，且可應用於理論最大值將近兩萬四千位元左右之問題上。模型建構式的基因演算法受限於模型建構的運算過程而難以平行化，此研究透過微幅降低模型的準確度及對演算法本身的修改，結合兩種異質計算的平行化後，顯示出模型建構式的基因演算法還是有利用平行化的方式來達到大幅加速的可能。	zh_TW
dc.description.abstract	DSMGA-II has good performances on many benchmark problems. DSMGA-II builds a model by detecting the problem structure and uses two mixing operators to generate better offsprings. The model building is one of the factors to achieve such performances. However, this process is time-consuming. Two mixing operators also take plenty of time when evaluating the fitness function. This thesis aims to use parallelisms on two heterogeneous computing platforms to speed up the model building process and the two mixing operators. I use GPU to parallelize the model building process and CPU to parallelize two mixing operators. I propose two speedup schemes for model building. The first scheme is algorithmically identical to the original DSMGA-II, and the second scheme sacrifices some model accuracy for further speedup. As for speeding up mixing operators, I use CPU instead of GPU to perform parallelization due to the core ability and also use it to parallelize the mixing operators with lossy implementations. On several commonly used benchmark problems, the speedup on CPU and GPU are stable, and the largest theoretical applicable proble size is up to about 24 thousand bits. MBGA is hard to be parallelized due to the model building process. However, this thesis shows a possibility to gain huge speedup by combining two heterogeneous parallelisms and lossy designs.	en
dc.description.provenance	Made available in DSpace on 2021-06-17T02:40:59Z (GMT). No. of bitstreams: 1 ntu-106-R04921056-1.pdf: 4619298 bytes, checksum: 27a09f06ca3766a6148a3d6a179337e0 (MD5) Previous issue date: 2017	en
dc.description.tableofcontents	口試委員會審定書 iii Acknowledgements v 誌謝 vii 摘要 ix Abstract xi 1 Introduction 1 2 DSMGA-II 5 3 Parallel Techniques 11 3.1 CUDA.................................... 12 3.1.1 Background............................. 12 3.1.2 ProgrammingModel ........................ 13 3.1.3 Design Constraints and Optimization Guidelines . . . . . . . . . 15 3.2 OpenMP................................... 17 3.2.1 Background............................. 17 3.2.2 ProgrammingModel ........................ 18 4 Speedup the Model Building 21 4.1 ExperimentsSettings ............................ 21 4.1.1 HardwareSpecifications ...................... 21 4.1.2 TestProblems............................ 22 4.2 CUDABasedSpeedup ........................... 25 4.2.1 Flow ................................ 25 4.2.2 FastCounting ............................ 26 4.3 LosslessScheme .............................. 28 4.3.1 SpeedUpBuildingDSM...................... 28 4.3.2 SpeedUpBuildingLinkageModel ................ 29 4.3.3 ExperimentResults......................... 31 4.3.4 TheoreticalAnalysis ........................ 33 4.4 LossyScheme:DSMGA-IIwithLMA................... 36 4.4.1 DesignConcept........................... 37 4.4.2 ExperimentResults......................... 39 5 Further Speedup with CPU Parallelism 43 5.1 ParallelImplementation........................... 43 5.1.1 LosslessSpeedups ......................... 44 5.1.2 Lossy Speedups on Mixing Operators: DSMGA-II with LMA PlusMP............................... 47 5.2 SpeedupResultsofDSMGA-IIwithLMAplusMP . . . . . . . . . . . . 49 5.3 ScalabilityofDSMGA-IIwithLMAplusMP . . . . . . . . . . . . . . . 54 6 Conclusions............................... 59 A Unsuccessful Trials 61 Bibliography 65
dc.language.iso	en
dc.subject	平行化	zh_TW
dc.subject	相依性結構矩陣遺傳演算法二代	zh_TW
dc.subject	異質化平行運算	zh_TW
dc.subject	DSMGA-II	en
dc.subject	CUDA	en
dc.subject	parallelization	en
dc.subject	Heterogeneous computing	en
dc.title	利用異質計算平行化加速相依性結構矩陣遺傳演算法二代	zh_TW
dc.title	Speeding up DSMGA-II with heterogeneous parallelisms	en
dc.type	Thesis
dc.date.schoolyear	105-2
dc.description.degree	碩士
dc.contributor.oralexamcommittee	雷欽隆(Chin-Laung Lei),張時中(Shi-Chung Chang)
dc.subject.keyword	相依性結構矩陣遺傳演算法二代,異質化平行運算,平行化,	zh_TW
dc.subject.keyword	DSMGA-II,Heterogeneous computing,parallelization,CUDA,	en
dc.relation.page	67
dc.identifier.doi	10.6342/NTU201703258
dc.rights.note	有償授權
dc.date.accepted	2017-08-17
dc.contributor.author-college	電機資訊學院	zh_TW
dc.contributor.author-dept	電機工程學研究所	zh_TW
顯示於系所單位：	電機工程學系

文件中的檔案：

檔案	大小	格式
ntu-106-1.pdf 未授權公開取用	4.51 MB	Adobe PDF

顯示文件簡單紀錄

系統中的文件，除了特別指名其著作權條款之外，均受到著作權保護，並且保留所有的權利。