Skip navigation

DSpace

機構典藏 DSpace 系統致力於保存各式數位資料(如:文字、圖片、PDF)並使其易於取用。

點此認識 DSpace
DSpace logo
English
中文
  • 瀏覽論文
    • 校院系所
    • 出版年
    • 作者
    • 標題
    • 關鍵字
    • 指導教授
  • 搜尋 TDR
  • 授權 Q&A
    • 我的頁面
    • 接受 E-mail 通知
    • 編輯個人資料
  1. NTU Theses and Dissertations Repository
  2. 電機資訊學院
  3. 資訊工程學系
請用此 Handle URI 來引用此文件: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85350
完整後設資料紀錄
DC 欄位值語言
dc.contributor.advisor洪士灝(Shih-Hao Hung)
dc.contributor.authorJing-Wei Wuen
dc.contributor.author吳靖崴zh_TW
dc.date.accessioned2023-03-19T22:59:27Z-
dc.date.copyright2022-07-26
dc.date.issued2022
dc.date.submitted2022-07-25
dc.identifier.citation[1] Nvidia dgx a100. https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-dgx-a100-datasheet.pdf. [2] Bernhard H Korte, Jens Vygen, B Korte, and J Vygen. Combinatorial optimization, volume 1. Springer, 2011. [3] Florian Neukart, Gabriele Compostella, Christian Seidel, David Von Dollen, Sheir Yarkoni, and Bob Parney. Traffic flow optimization using a quantum annealer. Frontiers in ICT, 4:29, 2017. [4] Davide Venturelli and Alexei Kondratyev. Reverse quantum annealing approach to portfolio optimization problems. Quantum Machine Intelligence, 1(1):17–30, 2019. [5] Alejandro Perdomo-Ortiz, Neil Dickson, Marshall Drew-Brook, Geordie Rose, and Alán Aspuru-Guzik. Finding low-energy conformations of lattice protein models by quantum annealing. Scientific reports, 2(1):1–7, 2012. [6] Hayato Ushijima-Mwesigwa, Christian FA Negre, and Susan M Mniszewski. Graph partitioning using quantum annealing on the d-wave system. In Proceedings of the Second International Workshop on Post Moores Era Supercomputing, pages 22–29, 2017. [7] Barry A Cipra. The ising model is np-complete. SIAM News, 33(6):1–3, 2000. [8] Peter JM Van Laarhoven and Emile HL Aarts. Simulated annealing. In Simulated annealing: Theory and applications, pages 7–15. Springer, 1987. [9] Aleta Berk Finnila, MA Gomez, C Sebenik, Catherine Stenson, and Jimmie D Doll. Quantum annealing: A new method for minimizing multidimensional functions. Chemical physics letters, 219(5-6):343–348, 1994. [10] Bettina Heim, Troels F Rønnow, Sergei V Isakov, and Matthias Troyer. Quantum versus classical annealing of ising spin glasses. Science, 348(6231):215–217, 2015. [11] D-wave systems. https://www.dwavesys.com/. [12] Masuo Suzuki. Relationship between d-dimensional quantal spin systems and (d+1)-dimensional ising systems: Equivalence, critical exponents and systematic approximants of the partition function and spin correlations. Progress of theoretical physics, 56(5):1454–1469, 1976. [13] Masuo Suzuki, Seiji Miyashita, and Akira Kuroda. Monte carlo simulation of quantum spin systems. i. Progress of Theoretical Physics, 58(5):1377–1387, 1977 [14] Hasitha Muthumala Waidyasooriya and Masanori Hariyama. A gpu-based quantum annealing simulator for fully-connected ising models utilizing spatial and temporal parallelism. IEEE Access, 8:67929–67939, 2020. [15] Yi-Hua Chung, Cheng-Jhih Shih, and Shih-Hao Hung. Accelerating simulated quantum annealing with gpu and tensor cores. In International Conference on High Performance Computing, pages 174–191. Springer, 2022. [16] Hasitha Muthumala Waidyasooriya and Masanori Hariyama. Highly-parallel fpga accelerator for simulated quantum annealing. IEEE Transactions on Emerging Topics in Computing, 9(4), 2019. [17] Masanao Yamaoka, Chihiro Yoshimura, Masato Hayashi, Takuya Okuyama, Hidetaka Aoki, and Hiroyuki Mizuno. A 20k-spin ising chip to solve combinatorial optimization problems with cmos annealing. IEEE Journal of Solid-State Circuits, 51(1):303–309, 2015. [18] Sergey Bravyi. Monte carlo simulation of stoquastic hamiltonians. arXiv preprint arXiv:1402.2295, 2014. [19] Elizabeth Crosson and Aram W Harrow. Simulated quantum annealing can be exponentially faster than classical simulated annealing. In 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS), pages 714–723. IEEE, 2016. [20] Takuya Okuyama, Masato Hayashi, and Masanao Yamaoka. An ising computer based on simulated quantum annealing by path integral monte carlo method. In 2017 IEEE international conference on rebooting computing (ICRC), pages 1–6. IEEE, 2017. [21] Hasitha Muthumala Waidyasooriya and Masanori Hariyama. Temporal and spatial parallel processing of simulated quantum annealing on a multicore cpu. The Journal of Supercomputing, 78(6):8733–8750, 2022. [22] Stefano Markidis, Steven Wei Der Chien, Erwin Laure, Ivy Bo Peng, and Jeffrey S Vetter. Nvidia tensor core programmability, performance & precision. In 2018 IEEE international parallel and distributed processing symposium workshops (IPDPSW), pages 522–531. IEEE, 2018. [23] Jack Choquette, Wishwesh Gandhi, Olivier Giroux, Nick Stam, and Ronny Krashinsky. Nvidia a100 tensor core gpu: Performance and innovation. IEEE Micro, 41(2):29–35, 2021. [24] Matej Špet’ko, Ondřej Vysockỳ, Branislav Jansík, and Lubomír Říha. Dgx-a100 face to face dgx-2—performance, power and thermal behavior evaluation. Energies, 14(2):376, 2021. [25] Gset. https://web.stanford.edu/~yyye/yyye/Gset/. [26] Stephan Mertens. Number partitioning. Computational complexity and statistical physics, page 125, 2006. [27] cublas. https://docs.nvidia.com/cuda/cublas/index.html. [28] Nvidia turing architecture. https://images.nvidia.com/aem-dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf. [29] Cutlass. https://github.com/NVIDIA/cutlass. [30] Tobias Maltenberger, Ivan Ilic, Ilin Tolovski, and Tilmann Rabl. Evaluating multi-gpu sorting with modern interconnects. SIGMOD. ACM, New York, NY, USA, 2022. [31] Nvidia ampere architecture. https://www.nvidia.com/content/PDF/nvidia-ampere-ga-102-gpu-architecture-whitepaper-v2.pdf. [32] Una Benlic and Jin-Kao Hao. Breakout local search for the max-cutproblem. Engineering Applications of Artificial Intelligence, 26(3):1162–1173, 2013.
dc.identifier.urihttp://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85350-
dc.description.abstract組合最佳化是指從所有可行解答的集合中尋找最佳解的過程,它在各種領域有許多重要的應用,如交通狀況模擬、蛋白質折疊模型、金融分析等。模擬量子退火(Simulated Quantum Annealing, SQA)是一種解決組合最佳化問題的啟發式演算法。 SQA在傳統電腦上模擬量子退火機器之量子穿隧效應,使其能有更高的機率找到總體最佳解。目前已有許多研究透過不同的硬體加速器來加速SQA,例如FPGA和GPU。大多數的研究是實作在單一加速器上,因此,SQA的規模會受限於單一加速器的計算能力和記憶體容量。在這篇論文中,我們提出了使用多GPU平行處理的做法來進一步加速SQA,並探索了幾種系統方面的設計來擴大SQA的規模。我們的成果與之前的作法相比,在使用8張A100 GPU 的情況下,達到7.2倍的速度提升,並且由於合計的記憶體提高至320GB,我們也將可解決的問題大小增加了8倍。此外,雖然CPU可使用較大的記憶體容量,但在解決同樣大的問題時,我們提出的多GPU的作法在執行時間上比CPU的作法快了約1000倍,並且在能耗效率上提高約100倍。zh_TW
dc.description.abstractCombinatorial optimization is the process of finding an optimal solution from a set of all feasible solutions, which has many important applications in various fields, such as traffic flow simulation, protein folding models, financial analysis, etc. Simulated quantum annealing (SQA) is a heuristic algorithm for solving combinatorial optimization problems, which simulates the quantum tunneling phenomena of quantum annealing machines on traditional computers to increase the chance of finding globally optimal solutions. There have been several works of speeding up SQA with hardware accelerators, such as FPGA and GPU. Most of the works are implemented on a single accelerator and hence, the scale of SQA is limited by the computing power and memory capacity of the accelerator. In this thesis, we propose a multi-GPU approach to further accelerate SQA and explore several system-level designs to increase the scale of SQA. As a result, we achieve 7.2x speedup and enlarge the solvable problem size by 8 times with 8 A100 GPUs and a total of 320GB of GPU memory. While a CPU-based implementation can also solve a problem of the same size, our work offers ~1000x speedup for execution time and ~100x better power efficiency over the CPU-based implementation.en
dc.description.provenanceMade available in DSpace on 2023-03-19T22:59:27Z (GMT). No. of bitstreams: 1
U0001-2407202220442100.pdf: 1770514 bytes, checksum: ca00c75c54c49579aab788f74309d587 (MD5)
Previous issue date: 2022
en
dc.description.tableofcontentsVerification Letter from the Oral Examination Committee - i Acknowledgements - iii 摘要 - v Abstract - vii Contents - ix List of Figures - xi Chapter 1 Introduction - 1 Chapter 2 Background - 3 2.1 Ising Model - 3 2.2 Simulated Quantum Annealing - 4 2.3 Implementation of SQA - 7 2.4 Scalability Issues for Large-Scale SQA - 11 Chapter 3 Methodology - 13 3.1 Saving Memory with a Compact Data Type - 13 3.2 Scaling the Problem Size with Host Memory - 17 3.3 Scaling the Problem Size with Memories across GPUs - 19 3.4 Utilizing Trotter Parallelism - 22 Chapter 4 Evaluation - 25 4.1 Experimental Environment - 25 4.2 Execution Time per Step - 26 4.3 Solution Quality - 28 4.4 Comparison to Other SQA Work - 31 Chapter 5 Conclusion - 35 References - 37
dc.language.isoen
dc.subject模擬量子退火zh_TW
dc.subject高效能計算zh_TW
dc.subject多圖形處理器加速zh_TW
dc.subjectNVLinkzh_TW
dc.subjectSimulated quantum annealingen
dc.subjectHigh performance computingen
dc.subjectMulti-GPU accelerationen
dc.subjectNVLinken
dc.title使用多GPU實現大型且快速的模擬量子退火zh_TW
dc.titleEnabling Large and Fast Simulated Quantum Annealing with Multi-GPUen
dc.typeThesis
dc.date.schoolyear110-2
dc.description.degree碩士
dc.contributor.oralexamcommittee李建模(Chien-Mo Li),黃鐘揚(Chung-Yang Huang),管希聖(Hsi-Sheng Goan),江介宏(Jie-Hong Jiang)
dc.subject.keyword高效能計算,多圖形處理器加速,NVLink,模擬量子退火,zh_TW
dc.subject.keywordHigh performance computing,Multi-GPU acceleration,NVLink,Simulated quantum annealing,en
dc.relation.page41
dc.identifier.doi10.6342/NTU202201676
dc.rights.note同意授權(限校園內公開)
dc.date.accepted2022-07-25
dc.contributor.author-college電機資訊學院zh_TW
dc.contributor.author-dept資訊工程學研究所zh_TW
dc.date.embargo-lift2022-07-26-
顯示於系所單位:資訊工程學系

文件中的檔案:
檔案 大小格式 
U0001-2407202220442100.pdf
授權僅限NTU校內IP使用(校園外請利用VPN校外連線服務)
1.73 MBAdobe PDF
顯示文件簡單紀錄


系統中的文件,除了特別指名其著作權條款之外,均受到著作權保護,並且保留所有的權利。

社群連結
聯絡資訊
10617臺北市大安區羅斯福路四段1號
No.1 Sec.4, Roosevelt Rd., Taipei, Taiwan, R.O.C. 106
Tel: (02)33662353
Email: ntuetds@ntu.edu.tw
意見箱
相關連結
館藏目錄
國內圖書館整合查詢 MetaCat
臺大學術典藏 NTU Scholars
臺大圖書館數位典藏館
本站聲明
© NTU Library All Rights Reserved