Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98943
Title: 多圖形處理器上深度學習網路訓練的記憶體優化
Optimizing Memory Usage in Deep Network Training with Multiple GPUs
Authors: 吳榮哲
Rong-Jhe Wu
Advisor: 劉邦鋒
Pangfeng Liu
Keyword: 深度學習, 管線平行化, 激活檢查點, 動態規劃
Deep Learning, Pipeline Parallelism, Activation Checkpointing, Dynamic Programming
Publication Year: 2025
Degree: Master's (碩士)
Abstract: Deep neural networks have become a widely successful framework, applied across a wide range of domains. However, modern use cases increasingly rely on larger models to achieve better performance, and this rapid growth in the number of parameters often results in memory bottlenecks during training. An effective approach to mitigating this issue is activation checkpointing, which stores only a subset of intermediate activations during the forward pass and recomputes the discarded activations during the backward pass to reduce memory consumption. In this work, we focus on minimizing memory usage when training deep neural networks across multiple GPUs. We employ pipeline parallelism to partition the model into smaller stages distributed across devices, and we apply checkpointing to further reduce memory demands under heavy workloads. Our goal is to identify checkpointing strategies that optimize memory efficiency during large-scale multi-GPU training.
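The abstract describes activation checkpointing only at a high level. As a purely illustrative sketch (not code from the thesis), the snippet below shows the idea in PyTorch on a hypothetical toy model (CheckpointedMLP, width, depth are made-up names): every other block is wrapped in torch.utils.checkpoint.checkpoint, so its intermediate activations are discarded after the forward pass and recomputed during the backward pass, trading extra compute for lower peak memory.

```python
# Minimal activation-checkpointing sketch (illustrative only, not the thesis's code).
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class CheckpointedMLP(nn.Module):
    def __init__(self, width=1024, depth=8):
        super().__init__()
        # Hypothetical toy model: a stack of identical Linear+ReLU blocks.
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(width, width), nn.ReLU()) for _ in range(depth)
        )

    def forward(self, x):
        for i, block in enumerate(self.blocks):
            if i % 2 == 0:
                # Checkpointed block: its internal activations are not kept;
                # they are recomputed when the backward pass reaches this block.
                x = checkpoint(block, x, use_reentrant=False)
            else:
                x = block(x)
        return x

model = CheckpointedMLP()
x = torch.randn(32, 1024)
loss = model(x).sum()
loss.backward()  # recomputes the checkpointed activations on the fly
```

Under pipeline parallelism as described in the abstract, each stage of the partitioned model would additionally reside on a different GPU, and the same store-versus-recompute trade-off would apply per stage.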
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98943
DOI: 10.6342/NTU202504398
Fulltext Rights: Authorized (worldwide open access)
Embargo Lift Date: 2025-08-21
Appears in Collections: 資訊工程學系 (Department of Computer Science and Information Engineering)

Files in This Item:
File: ntu-113-2.pdf (929.27 kB, Adobe PDF)


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
