Enabling Large and Fast Simulated Quantum Annealing with Multi-GPU

Jing-Wei Wu; 吳靖崴

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85350

Title:	Enabling Large and Fast Simulated Quantum Annealing with Multi-GPU (使用多GPU實現大型且快速的模擬量子退火)
Author:	Jing-Wei Wu (吳靖崴)
Advisor:	Shih-Hao Hung (洪士灝)
Keywords:	High performance computing, Multi-GPU acceleration, NVLink, Simulated quantum annealing
Publication Year:	2022
Degree:	Master's
Abstract:	Combinatorial optimization is the process of finding an optimal solution from the set of all feasible solutions. It has important applications in many fields, such as traffic flow simulation, protein folding models, and financial analysis. Simulated quantum annealing (SQA) is a heuristic algorithm for solving combinatorial optimization problems; it simulates the quantum tunneling effect of quantum annealing machines on conventional computers to increase the chance of finding the globally optimal solution. Several prior works have accelerated SQA with hardware accelerators such as FPGAs and GPUs. Most of them are implemented on a single accelerator, so the scale of SQA is limited by the computing power and memory capacity of that accelerator. In this thesis, we propose a multi-GPU approach to further accelerate SQA and explore several system-level designs to increase the scale of SQA. Compared with previous work, we achieve a 7.2x speedup with 8 A100 GPUs and, with a combined 320 GB of GPU memory, enlarge the solvable problem size by 8 times. Although a CPU has access to larger memory capacity, when solving a problem of the same size our multi-GPU approach is about 1000x faster in execution time and about 100x more power-efficient than the CPU-based implementation.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85350
DOI:	10.6342/NTU202201676
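Full-Text Authorization:	Authorized (on-campus access only)
Full-Text Release Date:	2022-07-26
Appears in Collections:	Department of Computer Science and Information Engineering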

Files in This Item:

File	Size	Format
U0001-2407202220442100.pdf (access restricted to NTU campus IPs; use the VPN service from off campus)	1.73 MB	Adobe PDF

Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.
