Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/83107
Full metadata record
DC field: value (language)

dc.contributor.advisor: 楊佳玲 (zh_TW)
dc.contributor.advisor: Chia-Lin Yang (en)
dc.contributor.author: 賴宥儒 (zh_TW)
dc.contributor.author: You-Ru Lai (en)
dc.date.accessioned: 2023-01-08T17:07:13Z
dc.date.available: 2023-11-09
dc.date.copyright: 2023-01-06
dc.date.issued: 2022
dc.date.submitted: 2022-11-25
dc.identifier.citation:
G. Linden, B. Smith, and J. York, “Amazon.com recommendations: Item-to-item collaborative filtering,” 2003
B. Acun, M. Murphy, X. Wang, J. Nie, C. Wu, and K. Hazelwood, “Understanding training efficiency of deep learning recommendation models at scale,” in Proceedings of 2021 IEEE International Symposium on High-Performance Computer Architecture, HPCA ’21, IEEE Computer Society, 2021
P. Covington, J. Adams, and E. Sargin, “Deep neural networks for YouTube recommendations,” in Proceedings of the 10th ACM Conference on Recommender Systems, RecSys ’16, Association for Computing Machinery, 2016
M. Zhao, N. Agarwal, A. Basant, B. Gedik, S. Pan, M. Ozdal, R. Komuravelli, J. Pan, T. Bao, H. Lu, S. Narayanan, J. Langman, K. Wilfong, H. Rastogi, C.-J. Wu, C. Kozyrakis, and P. Pol, “Understanding data storage and ingestion for large-scale deep recommendation model training: Industrial product,” in Proceedings of the 49th Annual International Symposium on Computer Architecture, ISCA ’22, Association for Computing Machinery, 2022
A. Eisenman, M. Naumov, D. Gardner, M. Smelyanskiy, S. Pupyrev, K. Hazelwood, A. Cidon, and S. Katti, “Bandana: Using non-volatile memory for storing deep learning models,” in Proceedings of 2nd Machine Learning and Systems conference, MLSys ’19, MLSys, 2019
W. Zhao, D. Xie, R. Jia, Y. Qian, R. Ding, M. Sun, and P. Li, “Distributed hierarchical gpu parameter server for massive scale deep learning ads systems,” in Proceedings of 3rd Machine Learning and Systems conference, MLSys ’20, MLSys, 2020
J. Axboe, A. D. Brunelle, and N. Scott, “blktrace(8) - Linux man page.” https://linux.die.net/man/8/blktrace, 2006
J. Axboe, “fio - flexible i/o tester.” https://fio.readthedocs.io/en/latest/fio_doc.html, 2017
M. Naumov, D. Mudigere, H.-J. M. Shi, J. Huang, N. Sundaraman, J. Park, X. Wang, U. Gupta, C.-J. Wu, A. G. Azzolini, D. Dzhulgakov, A. Mallevich, I. Cherniavskii, Y. Lu, R. Krishnamoorthi, A. Yu, V. Kondratenko, S. Pereira, X. Chen, W. Chen, V. Rao, B. Jia, L. Xiong, and M. Smelyanskiy, “Deep learning recommendation model for personalization and recommendation systems,” 2019
D. Mudigere, Y. Hao, J. Huang, Z. Jia, A. Tulloch, S. Sridharan, X. Liu, M. Ozdal, J. Nie, J. Park, L. Luo, J. A. Yang, L. Gao, D. Ivchenko, A. Basant, Y. Hu, J. Yang, E. K. Ardestani, X. Wang, R. Komuravelli, C.-H. Chu, S. Yilmaz, H. Li, J. Qian, Z. Feng, Y. Ma, J. Yang, E. Wen, H. Li, L. Yang, C. Sun, W. Zhao, D. Melts, K. Dhulipala, K. Kishore, T. Graf, A. Eisenman, K. K. Matam, A. Gangidi, G. J. Chen, M. Krishnan, A. Nayak, K. Nair, B. Muthiah, M. Khorashadi, P. Bhattacharya, P. Lapukhov, M. Naumov, A. Mathews, L. Qiao, M. Smelyanskiy, B. Jia, and V. Rao, “Software-hardware co-design for fast and scalable training of deep learning recommendation models,” in Proceedings of the 49th Annual International Symposium on Computer Architecture, ISCA ’22, Association for Computing Machinery, 2022
M. Gorman, “Swap management.” https://www.kernel.org/doc/gorman/html/understand/understand014.html, 2007
D. P. Bovet and M. Cesati, “Page frame reclaiming,” in Understanding the Linux Kernel, ch. 17, O’Reilly Media, 2005
SSDFans, “SSD 核心技術: FTL,” in 深入淺出 SSD: 固態存儲核心技術、原理與實戰, ch. 4, 機械工業出版社, 2018
CriteoLabs, “Download terabyte click logs.” https://labs.criteo.com/2013/12/download-terabyte-click-logs-2/, 2013
M. Adnan, Y. E. Maboud, D. Mahajan, and P. J. Nair, “Accelerating recommendation system training by leveraging popular choices,” in Proceedings of the VLDB Endowment, VLDB ’21, VLDB Endowment, 2021
D. Kalamkar, E. Georganas, S. Srinivasan, J. Chen, M. Shiryaev, and A. Heinecke, “Optimizing deep learning recommender systems training on cpu cluster architectures,” in Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC ’20, IEEE Press, 2020
X. Sun, H. Wan, Q. Li, C. Yang, T. Kuo, and C. Xue, “RM-SSD: In-storage computing for large-scale recommendation inference,” in Proceedings of 2022 IEEE International Symposium on High-Performance Computer Architecture, HPCA ’22, IEEE Computer Society, 2022
L. Ke, U. Gupta, B. Y. Cho, D. Brooks, V. Chandra, U. Diril, A. Firoozshahian, K. Hazelwood, B. Jia, H.-H. S. Lee, M. Li, B. Maher, D. Mudigere, M. Naumov, M. Schatz, M. Smelyanskiy, X. Wang, B. Reagen, C.-J. Wu, M. Hempstead, and X. Zhang, “RecNMP: Accelerating personalized recommendation with near-memory processing,” in Proceedings of the ACM/IEEE 47th Annual International Symposium on Computer Architecture, ISCA ’20, IEEE Press, 2020
P. Rubin, D. MacKenzie, and S. Kemp, “dd(1) - Linux man page.” https://linux.die.net/man/1/dd, 2010
M. Bjørling, J. González, and P. Bonnet, “LightNVM: The linux open-channel ssd subsystem,” in Proceedings of the 15th Usenix Conference on File and Storage Technologies, FAST ’17, USENIX Association, 2017
M. Bjørling, A. Aghayev, H. Holmberg, A. Ramesh, D. L. Moal, G. R. Ganger, and G. Amvrosiadis, “ZNS: Avoiding the block interface tax for flash-based SSDs,” in Proceedings of 2021 USENIX Annual Technical Conference, USENIX ATC ’21, USENIX Association, 2021
Y. Kwon, Y. Lee, and M. Rhu, “TensorDIMM: A practical near-memory processing architecture for embeddings and tensor operations in deep learning,” in Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO ’52, Association for Computing Machinery, 2019
H. Wan, X. Sun, Y. Cui, C.-L. Yang, T.-W. Kuo, and C. J. Xue, “FlashEmbedding: Storing embedding tables in ssd for large-scale recommender systems,” in Proceedings of the 12th ACM SIGOPS Asia-Pacific Workshop on Systems, APSys ’21, Association for Computing Machinery, 2021
M. Wilkening, U. Gupta, S. Hsia, C. Trippel, C.-J. Wu, D. Brooks, and G.-Y. Wei, “RecSSD: Near data processing for solid state drive based recommendation inference,” in Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS ’21, Association for Computing Machinery, 2021
D. Skourtis, D. Achlioptas, N. Watkins, C. Maltzahn, and S. Brandt, “Flash on rails: Consistent flash performance through redundancy,” in Proceedings of the 2014 USENIX Conference on USENIX Annual Technical Conference, USENIX ATC’14, USENIX Association, 2014
S. Yan, H. Li, M. Hao, M. H. Tong, S. Sundararaman, A. A. Chien, and H. S. Gunawi, “Tiny-tail flash: Near-perfect elimination of garbage collection tail latencies in NAND SSDs,” in Proceedings of 15th USENIX Conference on File and Storage Technologies, FAST ’17, USENIX Association, 2017
J. He, S. Kannan, A. C. Arpaci-Dusseau, and R. H. Arpaci-Dusseau, “The unwritten contract of solid state drives,” in Proceedings of the Twelfth European Conference on Computer Systems, EuroSys ’17, Association for Computing Machinery, 2017
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/83107
dc.description.abstract (zh_TW, translated): Recommendation systems are widely used to provide personalized suggestions, and the memory capacity required to train them keeps growing. Using swap to turn a solid-state drive (SSD) into an extension of memory can relieve the memory demand of training. However, swapping introduces additional read/write latency, and given the periodic retraining and redeployment of the model, using swap during recommendation system training requires attention to its impact on overall efficiency.

In this work, we observe that using swap makes training take 2 to 5 times longer. From our analysis, we identify the following causes of reduced training efficiency:
1. Memory accesses during recommendation system training are irregular, which leads to low memory utilization and frequent swapping, and therefore a large I/O volume.
2. Most read/write requests are smaller than 32 KB, which is unfavorable for exploiting the SSD's internal bandwidth.
3. SSD read/write bandwidth drops as the cumulative write volume grows, mainly because of the SSD's internal garbage collection mechanism.
We also use fio to simulate the observed I/O behavior and explore ways to improve SSD I/O efficiency. The experimental results are: 1. increasing the request size to 128 KB yields a 1.75x bandwidth improvement; 2. changing the write pattern to sequential writes yields a 4.37x improvement in write performance.
Finally, we offer two suggestions for using swap in recommendation system training: 1. aggregate swap data that is used at nearby times so that the SSD is accessed with larger request sizes, and use sequential writes when swapping memory out; 2. adopt an Open-Channel SSD or ZNS SSD as the swap device, which lets the system schedule the SSD's reads, writes, and garbage collection as needed, improving performance.
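As context for the swap-volume findings above, the following is a minimal sketch (an illustration, not taken from the thesis) of how swap traffic can be measured on Linux around a training run, using the pswpin/pswpout page counters in /proc/vmstat; the training command is a hypothetical placeholder.

    # Measure swap I/O volume around a workload via /proc/vmstat (Linux).
    # pswpin/pswpout count pages swapped in/out since boot.
    import os
    import subprocess

    def swap_counters():
        counters = {}
        with open("/proc/vmstat") as f:
            for line in f:
                name, value = line.split()
                if name in ("pswpin", "pswpout"):
                    counters[name] = int(value)
        return counters

    page_size = os.sysconf("SC_PAGE_SIZE")  # typically 4096 bytes

    before = swap_counters()
    # Hypothetical placeholder for the workload under study,
    # e.g. one DLRM training epoch.
    subprocess.run(["python3", "train_dlrm.py"], check=True)
    after = swap_counters()

    swapped_in = (after["pswpin"] - before["pswpin"]) * page_size
    swapped_out = (after["pswpout"] - before["pswpout"]) * page_size
    print(f"swapped in:  {swapped_in / 2**20:.1f} MiB")
    print(f"swapped out: {swapped_out / 2**20:.1f} MiB")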
dc.description.abstract (en): The deep learning recommendation model (DLRM) is widely used to provide personalized suggestions, and the memory capacity requirement for DLRM training keeps growing. Using swap to turn an SSD into a memory extension can alleviate the DRAM capacity demand of training. At the same time, swapping introduces additional I/O latency, and considering the cycle time of model retraining and redeployment, training DLRM with swap must account for the impact on efficiency.

In this thesis, we find that training time becomes 2 to 5 times longer when swap is used. Based on our analysis, we summarize the factors that influence training efficiency as follows. 1. The memory access pattern during DLRM training is irregular, which causes low memory utilization and a large number of swaps; the resulting I/O volume is huge. 2. Most I/O requests are smaller than 32 KB, which is unfavorable for utilizing the internal bandwidth of the SSD. 3. As the write volume increases, SSD read/write bandwidth decreases, mainly due to the internal garbage collection (GC) task in the SSD. In addition, we use fio to simulate the I/O behavior and experiment with ways to improve SSD I/O efficiency. The results are as follows. 1. Increasing the I/O request size to 128 KB leads to a 1.75x bandwidth improvement. 2. Changing the write pattern to sequential writes leads to a 4.37x improvement in write bandwidth. In the end, we provide two suggestions for using swap in DLRM training. 1. Aggregate swap data that will be used at nearby times so it can be read and written at a larger request size, and use sequential writes to swap out memory. 2. Choose an Open-Channel SSD or ZNS SSD, which allows the host to arrange reads, writes, and GC according to demand, thereby improving performance.
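To make the fio experiments above concrete, here is a minimal sketch, not taken from the thesis, of how the request-size and write-pattern comparison can be scripted against fio's command-line interface. The test file path, size, and queue depth are assumptions; absolute bandwidth will vary with the device and its GC state.

    # Compare small random writes vs. large sequential writes with fio,
    # mirroring the request-size and write-pattern experiments above.
    import json
    import subprocess

    def fio_write_bw(name, rw, bs):
        """Run one fio job and return its write bandwidth in KiB/s."""
        out = subprocess.run(
            ["fio", f"--name={name}", f"--rw={rw}", f"--bs={bs}",
             "--size=1g", "--ioengine=libaio", "--iodepth=32", "--direct=1",
             # Placeholder: a file on the SSD under test (not tmpfs).
             "--filename=/mnt/ssd/fio.test",
             "--output-format=json"],
            capture_output=True, text=True, check=True).stdout
        return json.loads(out)["jobs"][0]["write"]["bw"]

    small_random = fio_write_bw("randwrite-4k", "randwrite", "4k")
    large_seq = fio_write_bw("seqwrite-128k", "write", "128k")
    print(f"4 KiB random writes:       {small_random / 1024:.1f} MiB/s")
    print(f"128 KiB sequential writes: {large_seq / 1024:.1f} MiB/s")

On a fresh device the gap mainly reflects request-size efficiency; after sustained random writes, GC tends to widen it, consistent with the bandwidth degradation the abstract reports.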
dc.description.provenance (en): Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-01-08T17:07:13Z. No. of bitstreams: 0
dc.description.provenance (en): Made available in DSpace on 2023-01-08T17:07:13Z (GMT). No. of bitstreams: 0
dc.description.tableofcontents:
Verification Letter from the Oral Examination Committee i
Acknowledgements ii
摘要 (Chinese abstract) iii
Abstract iv
Contents vi
List of Figures viii
List of Tables x
Chapter 1 Introduction 1
Chapter 2 Background 5
2.1 Deep Learning Recommendation Model (DLRM) 5
2.2 Linux Memory Management and Swapping System 7
2.3 SSD Architecture 8
Chapter 3 Motivation 11
3.1 DLRM Training Time and I/O Volume 11
Chapter 4 Workload Analysis 15
4.1 Experimental Setup 15
4.2 Access Pattern and Utilization of Swapped-in Page 18
4.3 I/O Behavior Analysis 22
4.4 Latency Analysis 31
4.5 FIO Simulation 39
4.6 Discussion 43
Chapter 5 Related Works 45
5.1 DLRM Optimization 45
5.2 SSD Performance Optimization 48
Chapter 6 Conclusion 50
References 51
dc.language.iso: en
dc.subject: 置換系統 (swapping system) (zh_TW)
dc.subject: 深度學習推薦系統 (deep learning recommendation system) (zh_TW)
dc.subject: 垃圾回收機制 (garbage collection) (zh_TW)
dc.subject: 讀寫特徵 (I/O characteristics) (zh_TW)
dc.subject: 固態硬碟 (solid-state drive) (zh_TW)
dc.subject: Garbage Collection (en)
dc.subject: DLRM (en)
dc.subject: Swapping system (en)
dc.subject: SSD (en)
dc.subject: I/O characteristic (en)
dc.title: 深度學習推薦系統訓練之記憶體置換行為分析 (zh_TW)
dc.title: A Swapping Behavior Analysis of Deep Learning Recommendation System Training (en)
dc.title.alternative: A Swapping Behavior Analysis of Deep Learning Recommendation System Training
dc.type: Thesis
dc.date.schoolyear: 111-1
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 鄭湘筠; 陳依蓉 (zh_TW)
dc.contributor.oralexamcommittee: Hsiang-Yun Cheng; Yi-Jung Chen (en)
dc.subject.keyword: 深度學習推薦系統, 置換系統, 固態硬碟, 讀寫特徵, 垃圾回收機制 (zh_TW)
dc.subject.keyword: DLRM, Swapping system, SSD, I/O characteristic, Garbage Collection (en)
dc.relation.page: 55
dc.identifier.doi: 10.6342/NTU202210058
dc.rights.note: 同意授權(限校園內公開) (authorized; public access limited to the campus network)
dc.date.accepted: 2022-11-25
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science)
dc.contributor.author-dept: 資訊網路與多媒體研究所 (Graduate Institute of Networking and Multimedia)
Appears in Collections: 資訊網路與多媒體研究所 (Graduate Institute of Networking and Multimedia)

Files in This Item:
File: U0001-1121221117414063.pdf (3.45 MB, Adobe PDF)
Access: restricted to NTU campus IP addresses; use the library's VPN service for off-campus access


Except where otherwise noted, all items in this repository are protected by copyright, with all rights reserved.
