Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101421
Title: Design of a Low-Latency RDMA-Based Publish/Subscribe Framework for Inter-Machine GPU Communications (基於遠端直接記憶體存取之跨機器低延遲圖形處理器發佈/訂閱模式)
Author: Ren-Mao Jiang (姜任懋)
Advisor: Shih-Hao Hung (洪士灝)
Keywords: Publish/Subscribe System, Graphics Processing Unit (GPU), Remote Direct Memory Access (RDMA), Shared Memory, Middleware, High-Performance Computing
Publication year: 2026
Degree: Master's
Abstract:
With the rise of high-performance computing applications such as autonomous driving, artificial intelligence, and real-time image processing, the GPU has become the core computational unit for data generation and processing. Most existing publish/subscribe middleware, however, is designed for CPU architectures. Publish/subscribe operations between GPU device memories therefore involve multiple redundant copies and transfers between host memory and GPU device memory. This not only causes significant latency but also exacerbates PCIe bus bandwidth contention, limiting the system's overall parallel processing capability. Existing GPU publish/subscribe schemes can optimize intra-node transmission through shared memory, but they remain limited to the GPU compute resources and memory of a single host and so cannot meet the needs of large-scale distributed GPU clusters.
To address these issues, this thesis proposes Remote GPU-Aware Publish/Subscribe (RGAPS), a low-latency, cross-node, GPU-aware publish/subscribe framework based on Remote Direct Memory Access (RDMA). RGAPS extends the design concept of GPU publish/subscribe from a single-machine environment to cross-machine GPU communication. Architecturally, it combines two mechanisms. RDMA Write with Immediate provides efficient inter-node transmission that bypasses the CPU, and shared GPU memory on the receiving end materializes the received data as a single shared copy that multiple local subscribers can access in parallel. In contrast to conventional architectures, where overhead scales linearly with the subscriber count, this approach eliminates the need for redundant memory copy operations.
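The receive-side idea described above (one inbound write, one shared copy, many readers) can be illustrated with a small host-only simulation. This is a sketch under stated assumptions, not the RGAPS implementation: it uses no RDMA or GPU library, a bytearray stands in for the shared GPU buffer, and the class ReceiverNode, its methods, and the slot/length packing of the immediate value are all hypothetical. The only fact taken from RDMA itself is that the immediate value delivered with RDMA Write with Immediate is 32 bits wide.

```python
"""Toy model of single-copy fan-out on a receiving node (no real RDMA or GPU)."""
from typing import Callable, List, Tuple

# Assumed (hypothetical) layout of the 32-bit immediate value:
# the high 8 bits name the buffer slot, the low 24 bits give the payload length.
IMM_SLOT_BITS = 8
IMM_LEN_BITS = 24


def encode_imm(slot: int, length: int) -> int:
    """Pack a slot index and payload length into one 32-bit immediate."""
    if not 0 <= slot < (1 << IMM_SLOT_BITS):
        raise ValueError("slot does not fit in the immediate value")
    if not 0 <= length < (1 << IMM_LEN_BITS):
        raise ValueError("length does not fit in the immediate value")
    return (slot << IMM_LEN_BITS) | length


def decode_imm(imm: int) -> Tuple[int, int]:
    """Recover (slot, length) from an immediate produced by encode_imm."""
    return imm >> IMM_LEN_BITS, imm & ((1 << IMM_LEN_BITS) - 1)


class ReceiverNode:
    """Hypothetical receiver: a bytearray stands in for shared GPU memory."""

    def __init__(self, slot_size: int, num_slots: int) -> None:
        self.slot_size = slot_size
        self.num_slots = num_slots
        self._shared = bytearray(slot_size * num_slots)
        self._subscribers: List[Callable[[memoryview], None]] = []
        self.writes = 0  # number of times payload bytes were written

    def subscribe(self, callback: Callable[[memoryview], None]) -> None:
        self._subscribers.append(callback)

    def on_remote_write(self, slot: int, payload: bytes) -> int:
        """Simulate one inbound write landing in the buffer, then its completion."""
        if not 0 <= slot < self.num_slots:
            raise ValueError("slot out of range")
        if len(payload) > self.slot_size:
            raise ValueError("payload larger than a slot")
        start = slot * self.slot_size
        self._shared[start:start + len(payload)] = payload  # the only copy
        self.writes += 1
        imm = encode_imm(slot, len(payload))
        self._on_completion(imm)
        return imm

    def _on_completion(self, imm: int) -> None:
        """Hand every subscriber a read-only view of the same bytes."""
        slot, length = decode_imm(imm)
        start = slot * self.slot_size
        view = memoryview(self._shared)[start:start + length].toreadonly()
        for callback in self._subscribers:
            callback(view)  # no per-subscriber copy is made here


if __name__ == "__main__":
    node = ReceiverNode(slot_size=16, num_slots=4)
    for name in ("a", "b", "c"):
        node.subscribe(lambda v, n=name: print(n, bytes(v)))
    node.on_remote_write(slot=1, payload=b"frame-0")
```

In this toy model the payload is written once regardless of how many subscribers are registered; each subscriber only receives a view over the same memory, which is the property the abstract attributes to the shared-copy design.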
Experimental results show that, for intra-node distribution scalability, RGAPS decouples distribution latency from the number of subscribers. When transmitting a large 1 GB payload to 8 subscribers, the proposed architecture exhibits extremely low latency and near-constant performance, whereas the traditional approach sees latency rise significantly due to PCIe bandwidth contention. In conclusion, RGAPS effectively overcomes the software and hardware bottlenecks of cross-node GPU communication, providing a highly scalable communication solution for large-scale distributed GPU applications that require high bandwidth and low latency.
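The scaling contrast described above can be made concrete with a first-order cost model. The sketch below is illustrative only: the function names and the link bandwidth are assumptions, the model ignores setup costs, overlap, and contention effects beyond simple serialization, and its outputs are not measurements from the thesis.

```python
"""First-order copy-cost model: per-subscriber copies versus one shared copy."""


def conventional_latency_s(payload_bytes: int, subscribers: int,
                           link_bytes_per_s: float) -> float:
    """Assume each subscriber needs its own copy over the same link,
    so the copies serialize and total time grows with subscriber count."""
    if subscribers < 1 or link_bytes_per_s <= 0:
        raise ValueError("need at least one subscriber and a positive bandwidth")
    return subscribers * payload_bytes / link_bytes_per_s


def shared_copy_latency_s(payload_bytes: int, subscribers: int,
                          link_bytes_per_s: float) -> float:
    """Assume one copy is written and all subscribers read it in place,
    so the transfer time does not depend on subscriber count."""
    if subscribers < 1 or link_bytes_per_s <= 0:
        raise ValueError("need at least one subscriber and a positive bandwidth")
    return payload_bytes / link_bytes_per_s


if __name__ == "__main__":
    payload = 2 ** 30        # 1 GiB payload
    bandwidth = 16e9         # hypothetical link bandwidth in bytes per second
    for n in (1, 2, 4, 8):
        conv = conventional_latency_s(payload, n, bandwidth)
        shared = shared_copy_latency_s(payload, n, bandwidth)
        print(f"{n} subscribers: conventional {conv:.3f} s, shared {shared:.3f} s")
```

Under these assumptions the conventional path costs eight times as much as the shared-copy path at 8 subscribers, while the shared-copy cost stays flat; real systems will deviate from this in both directions.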
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101421
DOI: 10.6342/NTU202600379
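Full-text authorization: Authorized (open access worldwide)
Electronic full-text release date: 2026-02-04
Appears in department: Graduate Institute of Networking and Multimedia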

Files in this item:
File            Size     Format
ntu-114-1.pdf   7.8 MB   Adobe PDF
Items in this repository are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.