A quick, scalable, and comprehensive quantum circuit simulation for supercomputing

王傳啓; CHUAN CHI WANG

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101269

Title:	A quick, scalable, and comprehensive quantum circuit simulation for supercomputing (一套適用於超級運算之快速、可擴展且完善的量子電路模擬器)
Author:	王傳啓 CHUAN CHI WANG
Advisor:	洪士灝 Shih-Hao Hung
Keywords:	quantum computing, quantum circuit optimization, quantum circuit simulation, parallel programming, high-performance computation
Publication year:	2026
Degree:	Doctoral
Abstract:	Quantum circuit simulation is essential for developing and evaluating new quantum algorithms on classical computing resources. Although full-state quantum circuit simulation is widely used for prototyping and debugging, its execution time grows exponentially with the number of qubits, which poses a significant challenge even for large distributed systems. This thesis proposes a scalable simulation framework that exploits the architectural characteristics of parallel hardware and optimizes tensor product operations to make full-state simulation more efficient. For scalability, two simulation techniques, one storage-based and one based on remote direct memory access (RDMA), are introduced to support large-scale simulation at minimal cost. For performance, the framework consists of two core modules: the Swarm Optimization Module, which performs circuit-level optimizations, and the Flow Simulation Module, which dynamically switches to the simulation strategy best suited to the characteristics of each circuit. The Swarm Optimization Module integrates multiple techniques, including cache execution, gate fusion, unrestricted diagonal gate fusion, and tensor product acceleration. The Flow Simulation Module was developed entirely in-house without relying on third-party libraries, so that the measured performance reflects the framework itself. In addition, the thesis analyzes and optimizes the diagonal matrices that arise in Quantum Approximate Optimization Algorithm (QAOA) circuits.
In the experimental evaluation, gate-level and circuit-level benchmarks were run on DGX-A100 and DGX-H100 workstations, equipped with eight NVIDIA A100 GPUs and eight NVIDIA H100 GPUs, respectively. Compared with industry-leading simulators, including QuEST, Aer Simulator (with CUDA Thrust and cuQuantum backends), and HyQuas, the proposed framework achieves up to 45.3× speedup in gate-level benchmarks and 12.7× speedup in circuit-level benchmarks. Ablation experiments systematically assess the contribution of each optimization technique to overall performance, and the QAOA case study shows the performance improvements brought by the diagonal matrix optimizations. The simulator combines three core elements: a high-performance simulation engine, a three-level optimization mechanism, and fine-grained performance tuning. Collectively, these results show that the proposed framework advances the acceleration of full-state quantum circuit simulation and is expected to support more efficient exploration of new quantum algorithms.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101269
DOI:	10.6342/NTU202504792
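Full-text authorization:	Not authorized
Electronic full-text release date:	N/A
Appears in collections:	Department of Computer Science and Information Engineering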
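
The abstract refers to full-state simulation, whose cost grows exponentially with qubit count, and to diagonal gate fusion. The following is a minimal illustrative sketch of those two generic ideas in plain Python. It is not the thesis's implementation: the function names (`state_vector_bytes`, `apply_single_qubit_gate`, `rz_diagonal`, `fuse_diagonals`, `apply_diagonal`) are hypothetical, the 16-bytes-per-amplitude figure assumes double-precision complex numbers, and the fusion shown is the textbook elementwise product of diagonals rather than the "unrestricted" variant the thesis describes.

```python
import cmath
import math


def state_vector_bytes(n_qubits):
    """Memory for a full state vector, assuming 16 bytes per complex amplitude."""
    return 16 * (1 << n_qubits)


def apply_single_qubit_gate(state, gate, target):
    """Apply a 2x2 gate to qubit `target` of a full state vector, in place."""
    stride = 1 << target
    for base in range(0, len(state), stride << 1):
        for offset in range(stride):
            i0 = base + offset      # index with target bit = 0
            i1 = i0 + stride        # paired index with target bit = 1
            a0, a1 = state[i0], state[i1]
            state[i0] = gate[0][0] * a0 + gate[0][1] * a1
            state[i1] = gate[1][0] * a0 + gate[1][1] * a1
    return state


def rz_diagonal(n_qubits, target, theta):
    """Diagonal of RZ(theta) on `target`, expanded over the whole register."""
    phases = (cmath.exp(-0.5j * theta), cmath.exp(0.5j * theta))
    return [phases[(i >> target) & 1] for i in range(1 << n_qubits)]


def fuse_diagonals(diagonals):
    """Fuse several diagonal gates into one by elementwise multiplication."""
    fused = list(diagonals[0])
    for diag in diagonals[1:]:
        fused = [x * y for x, y in zip(fused, diag)]
    return fused


def apply_diagonal(state, diag):
    """Apply a diagonal operator: one multiply per amplitude, in place."""
    for i in range(len(state)):
        state[i] *= diag[i]
    return state


if __name__ == "__main__":
    n = 3
    h = 1 / math.sqrt(2)
    hadamard = [[h, h], [h, -h]]

    # Prepare the uniform superposition from |000>.
    state = [0j] * (1 << n)
    state[0] = 1 + 0j
    for q in range(n):
        apply_single_qubit_gate(state, hadamard, q)

    # Two diagonal gates applied one by one versus as a single fused pass.
    d1 = rz_diagonal(n, 0, 0.3)
    d2 = rz_diagonal(n, 2, 1.1)
    sequential = apply_diagonal(apply_diagonal(list(state), d1), d2)
    fused = apply_diagonal(list(state), fuse_diagonals([d1, d2]))
    print(max(abs(x - y) for x, y in zip(sequential, fused)) < 1e-12)  # True

    # Each extra qubit doubles the state vector.
    print(state_vector_bytes(30) / 2**30)  # 16.0 (GiB)
```

Because diagonal gates commute and only rescale amplitudes, fusing them replaces several passes over the state vector with one, which matters when the vector is too large to stay in cache. Under the stated 16-byte assumption, a 30-qubit register already needs 16 GiB.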

Files in this item:

File	Size	Format
ntu-114-1.pdf (not authorized for public access)	12.6 MB	Adobe PDF

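Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.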
