Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21973

| Title: | 對鬆散化梯度的資料聚集進行通訊量優化 (Communication Usage Optimization of Gradient Sparsification with Aggregation) |
| Authors: | Sheng-Ping Wang (王盛平) |
| Advisor: | Pangfeng Liu (劉邦鋒) |
| Keyword: | Parallel Processing, Distributed Systems, Deep Learning, Gradient Sparsification |
| Publication Year: | 2018 |
| Degree: | Master's |
| Abstract: | Communication is a bottleneck when scaling the number of workers in distributed deep learning. One solution is to compress the exchanged gradients into a sparse format through gradient sparsification. We observe that the amount of data the server sends back to the workers, which is the size of the aggregated sparse gradient, decreases as the gradients selected by the workers become more similar. A preliminary experiment shows that only a small fraction of the parameters produce large gradients repeatedly within a short period of time. Based on this observation, we propose several gradient selection algorithms, based on different metrics, by which workers choose the gradients to send. Our experiments show that they reduce the amount of data sent by the server, shorten the time per training iteration, and let the model converge faster than traditional sparsification. (See the illustrative sketch below the file listing.) |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21973 |
| DOI: | 10.6342/NTU201801738 |
| Fulltext Rights: | Not authorized for public access |
| Appears in Collections: | Department of Computer Science and Information Engineering |
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-107-1.pdf (Restricted Access) | 703.78 kB | Adobe PDF |
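
The abstract above notes that the server's send cost is the size of the aggregated sparse gradient, i.e. the union of the indices the workers select, so that cost shrinks when the workers' selections overlap. Below is a minimal NumPy sketch of that aggregation effect, assuming a plain top-k magnitude selection rule on the workers; the function names and the simulated gradients are illustrative and do not come from the thesis.

```python
# Illustrative sketch (not the thesis implementation): top-k gradient
# sparsification on workers, with server-side aggregation of the sparse
# gradients. All names here are hypothetical.
import numpy as np

def sparsify_top_k(grad, k):
    """Keep only the k largest-magnitude entries; return (indices, values)."""
    idx = np.argsort(np.abs(grad))[-k:]
    return idx, grad[idx]

def aggregate(sparse_grads, dim):
    """Server sums the sparse gradients from all workers.

    The result is itself sparse: its non-zero indices are the union of the
    workers' selected indices, so the more the workers' selections overlap,
    the less data the server must send back.
    """
    total = np.zeros(dim)
    for idx, vals in sparse_grads:
        total[idx] += vals
    nonzero = np.nonzero(total)[0]
    return nonzero, total[nonzero]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim, k, workers = 1000, 50, 4

    # Simulated per-worker gradients (random stand-ins for real gradients).
    grads = [rng.normal(size=dim) for _ in range(workers)]
    sparse = [sparsify_top_k(g, k) for g in grads]

    agg_idx, _ = aggregate(sparse, dim)
    print("indices each worker sends:", k)
    print("indices the server sends back (union):", len(agg_idx))
```

With independent random gradients the union is typically close to workers × k; selection rules that bias workers toward the same persistently large coordinates shrink this union, which is the reduction in server send cost that the thesis targets.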
