Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/73765
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 周承復(Cheng-Fu Chou) | |
dc.contributor.author | Yu-Nuo Juan | en |
dc.contributor.author | 阮昱諾 | zh_TW |
dc.date.accessioned | 2021-06-17T08:09:43Z | - |
dc.date.available | 2020-08-20 | |
dc.date.copyright | 2019-08-20 | |
dc.date.issued | 2019 | |
dc.date.submitted | 2019-08-16 | |
dc.identifier.citation | [1] M. Abadi et al. TensorFlow: Large-scale machine learning on heterogeneous systems.
[2] C. M. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.
[3] L. Bottou. Large-scale machine learning with stochastic gradient descent. In Proceedings of COMPSTAT'2010, pages 177–186, 2010.
[4] L. Bottou. Stochastic gradient descent tricks. In Neural Networks: Tricks of the Trade, Second Edition, pages 421–436, 2012.
[5] C. Brezinski and M. R. Zaglia. Extrapolation Methods: Theory and Practice, volume 2. Elsevier, 2013.
[6] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna. Rethinking the inception architecture for computer vision. In CVPR'16, 2016.
[7] Caffe2. Caffe2: A new lightweight, modular, and scalable deep learning framework.
[8] J. Chen, R. Monga, S. Bengio, and R. Jozefowicz. Revisiting distributed synchronous SGD. arXiv preprint arXiv:1604.00981, 2016.
[9] T. Chen, M. Li, Y. Li, M. Lin, N. Wang, M. Wang, T. Xiao, B. Xu, C. Zhang, and Z. Zhang. MXNet: A flexible and efficient machine learning library for heterogeneous distributed systems, 2015.
[10] T. M. Chilimbi, Y. Suzue, J. Apacible, and K. Kalyanaraman. Project Adam: Building an efficient and scalable deep learning training system. In OSDI, 2014.
[11] P. J. Davis. Interpolation and Approximation. Courier Corporation, 1975.
[12] J. Dean, G. Corrado, R. Monga, K. Chen, M. Devin, M. Mao, A. Senior, P. Tucker, K. Yang, Q. V. Le, et al. Large scale distributed deep networks. In Advances in Neural Information Processing Systems, 2012.
[13] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In CVPR'09, 2009.
[14] S. Hadjis, C. Zhang, I. Mitliagkas, D. Iter, and C. Ré. Omnivore: An optimizer for multi-device deep learning on CPUs and GPUs. arXiv preprint arXiv:1606.04487, 2016.
[15] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR'16, pages 770–778, 2016.
[16] Google Inc. tf_cnn_benchmarks: High performance benchmarks.
[17] D. P. Kingma and J. L. Ba. Adam: A method for stochastic optimization. In ICLR, 2015.
[18] L. Kleinrock. Queueing Systems, 1975.
[19] Y. LeCun, Y. Bengio, and G. Hinton. Deep learning. Nature, 521(7553):436–444, 2015.
[20] M. Li, D. G. Andersen, J. W. Park, A. J. Smola, A. Ahmed, V. Josifovski, J. Long, E. J. Shekita, and B.-Y. Su. Scaling distributed machine learning with the parameter server. In OSDI, 2014.
[21] S.-H. Lin, M. Paolieri, C.-F. Chou, and L. Golubchik. A model-based approach to streamlining distributed training for asynchronous SGD. In 2018 IEEE 26th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), 2018.
[22] M. Reiser. A queueing network analysis of computer communication networks with window flow control. IEEE Transactions on Communications, 27(8):1199–1209, 1979.
[23] M. Reiser and S. S. Lavenberg. Mean-value analysis of closed multichain queuing networks, 1980.
[24] G. A. Seber and A. J. Lee. Linear Regression Analysis, volume 329. John Wiley and Sons Inc., 2012.
[25] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/73765 | - |
dc.description.abstract | 深度神經網路最近在各領域獲得了巨大的成功,並吸引了更多世界各地學者的目光。大量的訓練工作考驗著軟硬體的發展。分散式學習是一種常見的加速方式。在這篇論文中我們會提出解決擴展學習環境的其中一個問題,也會解釋整個模型與背後使用的工具。 | zh_TW |
dc.description.abstract | Deep Neural Networks (DNNs) have been very successful and have drawn more and more attention from researchers all over the world. A huge demand for training jobs is challenging the development of both software tools and hardware systems. Distributed training is a common approach to speeding up these jobs. In this thesis, we propose a new method to address one of the problems in expanding the scale of a training environment, and we also explain the underlying model and the tools behind it. | en |
dc.description.provenance | Made available in DSpace on 2021-06-17T08:09:43Z (GMT). No. of bitstreams: 1 ntu-108-R05944037-1.pdf: 2400923 bytes, checksum: 6e26ee35a35010911b6eb3a7736454ae (MD5) Previous issue date: 2019 | en |
dc.description.tableofcontents | Contents
口試委員會審定書 (Certification by the Oral Examination Committee)
誌謝 (Acknowledgements in Chinese)
Acknowledgements
摘要 (Abstract in Chinese)
Abstract
1 Introduction
1.1 Parameter Server Architecture
1.2 Asynchronous SGD
1.3 Achievements and goals
2 Related Works
3 Throughput Estimation Model
3.1 Queueing Model
3.2 Mean-Value Analysis (MVA)
3.2.1 Known Parameters
3.2.2 Recursive Calculation
3.2.3 Multiple Parameter Servers Calculation
4 Packet Analyzing Tools
4.1 PacketCapture
4.2 DumpProcessor
5 Evaluation
5.1 Local PC Clusters (CPU Only)
5.1.1 Different models and batch size
5.1.2 Limited network bandwidth
5.2 Google Cloud Case (CPU and GPU)
5.2.1 Different models and batch size
5.2.2 Limited network bandwidth
5.3 Comparison
6 Conclusion
Bibliography | |
dc.language.iso | en | |
dc.title | 建立使用非同步隨機梯度下降法的分散式訓練之多參數伺服器模型 | zh_TW |
dc.title | Multi-parameter-server modeling for distributed asynchronous SGD | en |
dc.type | Thesis | |
dc.date.schoolyear | 107-2 | |
dc.description.degree | 碩士 | |
dc.contributor.oralexamcommittee | 梁嘉德,廖婉君,呂政修,吳曉光 | |
dc.subject.keyword | 深度學習,深度神經網路,分散式機器學習,排隊網路,TensorFlow | zh_TW |
dc.subject.keyword | Deep Learning,Deep Neural Networks,Distributed Machine Learning,Queueing Networks,TensorFlow | en |
dc.relation.page | 41 | |
dc.identifier.doi | 10.6342/NTU201903787 | |
dc.rights.note | 有償授權 | |
dc.date.accepted | 2019-08-16 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 資訊網路與多媒體研究所 | zh_TW |
Appears in Collections: | 資訊網路與多媒體研究所
Files in This Item:
File | Size | Format |
---|---|---|
ntu-108-1.pdf (currently not authorized for public access) | 2.34 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated in their copyright terms.