  1. NTU Theses and Dissertations Repository
  2. College of Electrical Engineering and Computer Science
  3. Graduate Institute of Networking and Multimedia
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96088
Title: Accelerating Extreme Classification with In-Memory-Computing 3D-NAND Flash Memory (基於三維快閃記憶體內運算加速巨量標籤分類模型)
Authors: Yuan-Hao Zhong (鍾元皓)
Advisor: Chia-Lin Yang (楊佳玲)
Keyword: Computing-in-Memory, 3D-NAND Flash, Extreme classification
Publication Year: 2024
Degree: Master's
Abstract: Extreme classification, which involves a vast number of categories, has become essential in modern applications such as product search, recommendation systems, and language models. As the number of categories grows to the million scale, the weights of the final classification layer can easily reach several hundred gigabytes, leading to enormous memory requirements. Moving this much weight data in a traditional von Neumann architecture results in the memory wall problem.

To alleviate this problem, existing solutions use in-storage processing techniques based on approximate algorithms. However, these solutions still incur extra data movement inside the SSD during weight transfer, which makes further performance improvements difficult.

We leverage a computing-in-memory 3D-NAND flash architecture to overcome these challenges. Our architecture provides a higher filter rate, which reduces overall data transfer, and it eliminates the transfer of low-precision weights. We present a software and hardware co-design approach that uses clustering data placement and adaptive threshold adjustment to improve the performance of extreme classification. Clustering data placement improves the efficiency of our computing-in-memory architecture during execution. Adaptive threshold adjustment ensures that our system maintains the expected filter rate across different inferences.

Overall, our work achieves an 8.1x speedup and 6.2x energy savings compared with the state-of-the-art in-storage processing baseline.
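The abstract does not spell out how the threshold is adapted, so the following is only a minimal sketch of the general idea, not the thesis's actual method. It assumes a simple proportional feedback rule: after each inference, the observed filter rate (the fraction of labels discarded by the low-precision pass) is compared with a target, and the threshold is nudged accordingly. The function names, the `gain` parameter, and the Gaussian scores with a drifting mean are all hypothetical stand-ins for illustration.

```python
import random


def filter_labels(approx_scores, threshold):
    """Return indices of labels whose approximate score passes the threshold."""
    return [i for i, s in enumerate(approx_scores) if s >= threshold]


def update_threshold(threshold, num_kept, num_labels, target_filter_rate, gain=1.0):
    """Hypothetical feedback rule: move the threshold toward the target filter rate.

    If too few labels were filtered out, the threshold rises; if too many
    were filtered out, it falls.
    """
    observed = 1.0 - num_kept / num_labels
    new_threshold = threshold + gain * (target_filter_rate - observed)
    return new_threshold, observed


def simulate(num_inferences=200, num_labels=2000, target_filter_rate=0.9, seed=0):
    """Run synthetic inferences whose score distribution drifts over time."""
    rng = random.Random(seed)
    threshold = 0.0
    history = []
    for step in range(num_inferences):
        # Assumption: scores are Gaussian and their mean drifts slowly,
        # standing in for queries with different score statistics.
        drift = step / num_inferences
        scores = [rng.gauss(drift, 1.0) for _ in range(num_labels)]
        kept = filter_labels(scores, threshold)
        threshold, observed = update_threshold(
            threshold, len(kept), num_labels, target_filter_rate
        )
        history.append(observed)
    return threshold, history


if __name__ == "__main__":
    final_threshold, rates = simulate()
    tail = rates[-50:]
    print("final threshold:", round(final_threshold, 3))
    print("mean filter rate over last 50 inferences:", round(sum(tail) / len(tail), 3))
```

In this toy setup a fixed threshold would let the filter rate fall as the scores drift upward, whereas the feedback rule keeps it near the target. Only labels that survive the filter would then need their full-precision weights fetched for exact scoring.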
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96088
DOI: 10.6342/NTU202404362
Fulltext Rights: Authorized (access restricted to campus)
Appears in Collections: Graduate Institute of Networking and Multimedia

Files in This Item:
File: ntu-113-1.pdf (access limited to the NTU IP range)
Size: 2.3 MB
Format: Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
