Effective Reconfigurable Multiplication and Integrated Memory Distribution on FPGAs (植基於現場可程式化邏輯閘陣列晶片之高效重置乘法器與集成型記憶體分配架構)

Chia-Chen Yen; 嚴家成

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/82082

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	陳銘憲(Ming-Syan Chen)
dc.contributor.author	Chia-Chen Yen	en
dc.contributor.author	嚴家成	zh_TW
dc.date.accessioned	2022-11-25T05:35:29Z	-
dc.date.available	2027-02-10
dc.date.copyright	2022-03-02
dc.date.issued	2022
dc.date.submitted	2022-02-12
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/82082	-
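dc.description.abstract	Digital signal processing (DSP) plays an important role in modern electronic systems such as mobile communications and autonomous-vehicle control units. A widely used computation in DSP is multiple constant multiplication (MCM), which appears in digital filters and discrete transforms. While application-specific integrated circuits (ASICs) remain widely used, a growing number of high-performance DSP systems are implemented on field-programmable gate arrays (FPGAs), which has led many studies to focus on reducing the complexity of MCM multipliers. However, because the computational resources of an FPGA are limited, DSP applications that use MCM operations must frequently reconfigure the multiplication coefficients inside the accelerator, and to date no study has focused on developing MCM multipliers under a uniform hardware topology so as to avoid time-consuming reconfiguration. In addition, these DSP systems usually need a highly parallel memory architecture to guarantee consistent data access. Although current FPGAs provide dual-port memory blocks, composing these blocks into multi-ported memories for parallel access consumes a large amount of memory. In this dissertation, we formulate a problem called unified multiple constant multiplication (UMCM) for applications that frequently reconfigure multiplier blocks, and we propose a framework called compatible graph synthesis to construct UMCM multipliers efficiently. In this framework, the coefficients of a multiplier block can be reconfigured quickly and dynamically using only a set of parameters for logarithmic shifters, which avoids the FPGA's built-in reconfiguration techniques and the added system response time they incur. Experimental results show that the proposed UMCM technique is well suited to DSP applications that must frequently reconfigure multipliers because of limited hardware resources. Furthermore, to ensure consistent data access in these DSP systems, we also propose a parallel memory access architecture that supports multi-port switching, called IMPC, to address the heavy consumption of memory and logic elements. The architecture predefines a set of memory units with soft/hard access ports and then uses a minimum-set formulation (最小集合配置, MSP) to find a combination that uses the fewest memory units. Experimental results show that the proposed architecture effectively reduces the consumption of memory and related computing resources.	zh_TW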
dc.description.provenance	Made available in DSpace on 2022-11-25T05:35:29Z (GMT). No. of bitstreams: 1 U0001-1002202214223600.pdf: 4112020 bytes, checksum: 7064343884f2d934903ff382761ff53d (MD5) Previous issue date: 2022	en
dc.description.tableofcontents	Acknowledgements
Abstract (in Chinese)
Abstract
Contents
List of Figures
1 Introduction
1.1 Motivation
1.2 Goal and Contributions of the Dissertation
1.3 Organization of the Dissertation
2 Preliminary Background
2.1 Field Programmable Gate Arrays
2.1.1 Basic Logic Elements (BLEs) in Xilinx FPGAs
2.1.2 Block Random Access Memories in FPGAs
2.2 Constant Multiplications on FPGAs
2.2.1 Single Constant Multiplication
2.2.1.1 Binary Multiplication
2.2.1.2 Multiplication using Signed Digit Representation
2.2.2 Multiple Constant Multiplication
2.2.2.1 Adder Graph Representation
2.2.2.2 The MCM Problem
2.2.2.3 Mapping of Adder Graphs to FPGAs
2.3 RAM Multi-Porting Distribution Techniques on FPGAs
2.3.1 Multi-Read BRAM: Bank Replication
2.3.2 Multi-Write RAM: Multi-Banking
2.3.3 Time-Multiplexing RAM: Multi-Pumping
3 Unified Multiple Constant Multipliers for Dynamic Exchange of Low-Precision Kernels on FPGAs
3.1 Introduction
3.2 Preliminary and Related Work
3.2.1 Adder Graph Representation of Multiplier Block
3.2.2 Odd Fundamental Graph Transformation The MCM Framework
3.2.3 Bull-Horrocks algorithm (BHA)
3.2.4 Bull-Horrocks modified algorithm (BHM)
3.2.5 n-Dimensional reduced adder graph (RAGn)
3.2.6 Heuristic based on Cumulative Benefit (Hcub)
3.3 The Unified MCM Problem and The Compatible Graph Synthesis Framework
3.3.0.1 Equivalent Index Evaluation (EIE)
3.3.0.2 Congruous Interpolation (CI)
3.4 Experimental Results
3.4.1 Agraph Synthesis Results
3.4.2 Multiplication Time Overhead and Average Throughput
3.4.3 Partial Reconfiguration Overhead
3.5 Summary
4 Integrated Multi-Ported Memory Distribution for Temporal/Spatial-Multiplexing Parallel Workloads on FPGAs
4.1 Introduction
4.2 Preliminary and Related Work
4.2.1 Temporal/Spatial-Multiplexing Parallel Workloads
4.2.2 Conventional Multi-Ported BRAM Distribution Schemes
4.2.2.1 Uni-directional Distribution Scheme
4.2.2.2 Bi-directional Distribution Scheme
4.2.2.3 Multi-Switched-Porting Distribution Scheme (MSPDS)
4.2.3 Hard/Soft Porting BRAM Configurations
4.2.4 Graph Modelling for Uni/Bi-directional Schemes and MSPDS
4.3 Integrated Multi-Porting Configuration for BRAM Distribution
4.3.1 Write/Read Coexistence Graph with (pq)-Cliques
4.3.2 Integrated Multi-Porting Optimization on WRCG
4.4 Experimental Results
4.4.1 Analysis of BRAM Consumption
4.4.2 Space Analysis - Number of Occupied LUTs
4.5 Summary
5 Conclusion
Bibliography
dc.language.iso	en
dc.subject	Multi-ported memory	zh_TW
dc.subject	Field-programmable gate array	zh_TW
dc.subject	Multiple constant multiplication	zh_TW
dc.subject	Unified multiple constant multiplication	zh_TW
dc.subject	FPGA	en
dc.subject	MCM	en
dc.subject	UMCM	en
dc.subject	Multi-Ported Memory	en
dc.title	植基於現場可程式化邏輯閘陣列晶片之高效重置乘法器與集成型記憶體分配架構	zh_TW
dc.title	Effective Reconfigurable Multiplication and Integrated Memory Distribution on FPGAs	en
dc.date.schoolyear	110-1
dc.description.degree	Doctoral
dc.contributor.author-orcid	0000-0002-4286-4806
dc.contributor.oralexamcommittee	葉彌妍(Mi-Yen Yeh),張原豪(Yuan-Hao Chang),修丕承(Pi-Cheng Hsiu),賴冠廷(Kuan-Ting Lai)
dc.subject.keyword	Field-programmable gate array, multiple constant multiplication, unified multiple constant multiplication, multi-ported memory	zh_TW
dc.subject.keyword	FPGA, MCM, UMCM, Multi-Ported Memory	en
dc.relation.page	90
dc.identifier.doi	10.6342/NTU202200528
dc.rights.note	Authorization granted (access restricted to campus)
dc.date.accepted	2022-02-12
dc.contributor.author-college	College of Electrical Engineering and Computer Science	zh_TW
dc.contributor.author-dept	Graduate Institute of Electrical Engineering	zh_TW
dc.date.embargo-lift	2027-02-14	-
Appears in Collections:	Department of Electrical Engineering

Files in This Item:

File	Size	Format
U0001-1002202214223600.pdf (not authorized for public access)	4.02 MB	Adobe PDF	View/Open


Unless their copyright terms are specifically stated, all items in the system are protected by copyright, with all rights reserved.