Low Power Instruction Cache and Cost-Effective Decoder of Electron-Beam Direct-Write Systems (低功耗指令快取記憶體與高成本效益的電子束直寫系統之解碼器)

Chun-Chang Yu; 余俊璋

Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/19731

Title:	低功耗指令快取記憶體與高成本效益的電子束直寫系統之解碼器 Low Power Instruction Cache and Cost-Effective Decoder of Electron-Beam Direct-Write Systems
Authors:	Chun-Chang Yu 余俊璋
Advisor:	陳中平(Chung-Ping Chen)
Keyword:	dynamic early tag lookup, dynamic power reduction, energy-efficient processor, associative instruction cache, multiple electron-beam direct-write, data compression, hardware decoder
Publication Year :	2021
Degree:	Doctoral (Ph.D.)
Abstract:	Power consumption is a primary concern in modern processors, and instruction fetch, which relies on the instruction cache, is an energy hot spot. Reducing the dynamic power of instruction cache lookups therefore improves processor energy efficiency. This dissertation proposes an energy-efficient, low-area-overhead instruction cache lookup technique called Dynamic Early Tag Lookup (DETL). DETL exploits idle fetch (bubble) cycles to perform an early tag lookup on the cache index. Knowing the matching memory bank in advance saves the dynamic energy otherwise spent accessing the other cache memory banks in parallel. We implement DETL on a four-way set-associative I-cache in a RISC-V microarchitecture and test its performance using the SPEC CPU2006 benchmark suite. We observe a 19.38% reduction in dynamic power with less than 0.1% area overhead.
Data throughput is a critical metric in a Multiple Electron-Beam Direct-Write (MEBDW) system, so high-performance data processing equipment is required. The main challenge is achieving high performance with cost-effective techniques. This dissertation also proposes a high-compression-rate algorithm for efficient data transfer and high-speed decompression hardware to raise the data throughput of the system. The hardware decoder uses a pipelined architecture, a run-length-encoding First-In-First-Out (FIFO) queue, and parallel dispatch logic to increase throughput. The decoder is evaluated on a Field-Programmable Gate Array (FPGA) and simulated with layout images compressed by our proposed compression software. The results show an 18.2% better compression rate and 254.8% higher throughput than previous work at similar hardware cost. Because the design uses no Static Random-Access Memory (SRAM), the number of channels in the system can be scaled up easily, which makes it possible for next-generation MEBDW systems to reach higher Wafers Per Hour (WPH) targets.
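The idea behind early tag lookup can be illustrated with a small counting model. This is only a sketch under stated assumptions, not the dissertation's implementation: the function name `data_way_reads`, the per-fetch flags, and the rule that a known miss reads no data bank are all hypothetical. It counts data-bank reads only and says nothing about the measured power figures reported above.

```python
def data_way_reads(fetches, num_ways=4):
    """Count data-bank reads for a set-associative I-cache.

    fetches: iterable of (hit, early_tag_known) booleans, one per fetch.
      hit              - the fetch hits in the cache
      early_tag_known  - assumption: a bubble cycle allowed the tag lookup
                         to finish before the data access
    Returns (conventional_reads, early_lookup_reads).
    """
    conventional = 0
    early = 0
    for hit, early_tag_known in fetches:
        # Conventional lookup reads every way in parallel on each fetch.
        conventional += num_ways
        if early_tag_known:
            # Matching way already known: read one bank on a hit,
            # none on a miss (modelling assumption).
            early += 1 if hit else 0
        else:
            # No bubble cycle was available: fall back to parallel access.
            early += num_ways
    return conventional, early


# Hypothetical trace: three hits and one miss, one fetch without a bubble.
trace = [(True, True), (True, False), (True, True), (False, True)]
conventional, early = data_way_reads(trace)
print(conventional, early)
```

The fewer fetches that fall back to parallel access, the closer the early-lookup count gets to one bank read per hit.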
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/19731
DOI:	10.6342/NTU202100624
Fulltext Rights:	Not authorized
Appears in Collections:	Graduate Institute of Electronics Engineering (電子工程學研究所)

Files in This Item:

File	Size	Format
U0001-0602202100024700.pdf Restricted Access	2.58 MB	Adobe PDF
