Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/69954
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 陳良基 | |
dc.contributor.author | Yi-Kai Chen | en |
dc.contributor.author | 陳奕愷 | zh_TW |
dc.date.accessioned | 2021-06-17T03:35:51Z | - |
dc.date.available | 2023-03-02 | |
dc.date.copyright | 2018-03-02 | |
dc.date.issued | 2018 | |
dc.date.submitted | 2018-02-12 | |
dc.identifier.citation | [1] Russakovsky, Olga, et al. 'Imagenet large scale visual recognition challenge.' International Journal of Computer Vision 115.3 (2015): 211-252.
[2] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. 'Imagenet classification with deep convolutional neural networks.' Advances in neural information processing systems. 2012. [3] Simonyan, Karen, and Andrew Zisserman. 'Very deep convolutional networks for large-scale image recognition.' arXiv preprint arXiv:1409.1556 (2014). [4] Lai, Liangzhen, Naveen Suda, and Vikas Chandra. 'Deep Convolutional Neural Network Inference with Floating-point Weights and Fixed-point Activations.' arXiv preprint arXiv:1703.03073 (2017). [5] Szegedy, Christian, et al. 'Going deeper with convolutions.' Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015. [6] He, Kaiming, et al. 'Deep residual learning for image recognition.' Proceedings of the IEEE conference on computer vision and pattern recognition. 2016. [7] Lawrence, Steve, et al. 'Face recognition: A convolutional neural-network approach.' IEEE transactions on neural networks 8.1 (1997): 98-113. [8] Schroff, Florian, Dmitry Kalenichenko, and James Philbin. 'Facenet: A unified embedding for face recognition and clustering.' Proceedings of the IEEE conference on computer vision and pattern recognition. 2015. [9] Abdel-Hamid, Ossama, et al. 'Applying convolutional neural networks concepts to hybrid NN-HMM model for speech recognition.' Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on. IEEE, 2012. [10] Kim, Yoon. 'Convolutional neural networks for sentence classification.' arXiv preprint arXiv:1408.5882 (2014). [11] Kalchbrenner, Nal, Edward Grefenstette, and Phil Blunsom. 'A convolutional neural network for modelling sentences.' arXiv preprint arXiv:1404.2188 (2014). [12] Hu, Baotian, et al. 'Convolutional neural network architectures for matching natural language sentences.' Advances in neural information processing systems. 2014. [13] Dong, Chao, et al. 'Learning a deep convolutional network for image super-resolution.' 
European Conference on Computer Vision. Springer, Cham, 2014. [14] Dong, Chao, Chen Change Loy, and Xiaoou Tang. 'Accelerating the super-resolution convolutional neural network.' European Conference on Computer Vision. Springer, Cham, 2016. [15] Shi, Wenzhe, et al. 'Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network.' Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. [16] Le, Quoc V. 'Building high-level features using large scale unsupervised learning.' Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE, 2013. [17] Coates, Adam, et al. 'Deep learning with COTS HPC systems.' International Conference on Machine Learning. 2013. [18] Jia, Yangqing, et al. 'Caffe: Convolutional architecture for fast feature embedding.' Proceedings of the 22nd ACM international conference on Multimedia. ACM, 2014. [19] Qiu, Jiantao, et al. 'Going deeper with embedded fpga platform for convolutional neural network.' Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. ACM, 2016. [20] Cavigelli, Lukas, and Luca Benini. 'Origami: A 803-gop/s/w convolutional network accelerator.' IEEE Transactions on Circuits and Systems for Video Technology 27.11 (2017): 2461-2475. [21] Gysel, Philipp, Mohammad Motamedi, and Soheil Ghiasi. 'Hardware-oriented approximation of convolutional neural networks.' arXiv preprint arXiv:1604.03168 (2016). [22] Samuel, Arthur L. 'Some studies in machine learning using the game of checkers.' IBM Journal of research and development 3.3 (1959): 210-229. [23] Lowe, David G. 'Distinctive image features from scale-invariant keypoints.' International journal of computer vision 60.2 (2004): 91-110. [24] Dalal, Navneet, and Bill Triggs. 'Histograms of oriented gradients for human detection.' Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on. Vol. 1. IEEE, 2005. 
[25] Chen, Yu-Hsin, et al. 'Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks.' IEEE Journal of Solid-State Circuits 52.1 (2017): 127-138. [26] Judd, Patrick, et al. 'Reduced-precision strategies for bounded memory in deep neural nets.' arXiv preprint arXiv:1511.05236 (2015). [27] Chen, Yunji, et al. 'Dadiannao: A machine-learning supercomputer.' Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture. IEEE Computer Society, 2014. [28] Chen, Tianshi, et al. 'Diannao: A small-footprint high-throughput accelerator for ubiquitous machine-learning.' ACM Sigplan Notices 49.4 (2014): 269-284. [29] Jouppi, Norman P., et al. 'In-datacenter performance analysis of a tensor processing unit.' Proceedings of the 44th Annual International Symposium on Computer Architecture. ACM, 2017. [30] Du, Zidong, et al. 'ShiDianNao: Shifting vision processing closer to the sensor.' ACM SIGARCH Computer Architecture News. Vol. 43. No. 3. ACM, 2015. [31] Zhang, Chen, et al. 'Optimizing fpga-based accelerator design for deep convolutional neural networks.' Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. ACM, 2015. [32] Li, Huimin, et al. 'A high performance FPGA-based accelerator for large-scale convolutional neural networks.' Field Programmable Logic and Applications (FPL), 2016 26th International Conference on. IEEE, 2016. [33] Sim, Jaehyeong, et al. '14.6 a 1.42 tops/w deep convolutional neural network recognition processor for intelligent ioe systems.' Solid-State Circuits Conference (ISSCC), 2016 IEEE International. IEEE, 2016. [34] Moons, Bert, and Marian Verhelst. 'A 0.3–2.6 TOPS/W precision-scalable processor for real-time large-scale ConvNets.' VLSI Circuits (VLSI-Circuits), 2016 IEEE Symposium on. IEEE, 2016. [35] Moons, Bert, et al. 
'14.5 envision: A 0.26-to-10tops/w subword-parallel dynamic-voltage-accuracy-frequency-scalable convolutional neural network processor in 28nm fdsoi.' Solid-State Circuits Conference (ISSCC), 2017 IEEE International. IEEE, 2017. [36] Anwar, Sajid, Kyuyeon Hwang, and Wonyong Sung. 'Fixed point optimization of deep convolutional neural networks for object recognition.' Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on. IEEE, 2015. [37] Gupta, Suyog, et al. 'Deep learning with limited numerical precision.' International Conference on Machine Learning. 2015. [38] Li, Fengfu, Bo Zhang, and Bin Liu. 'Ternary weight networks.' arXiv preprint arXiv:1605.04711 (2016). [39] Rastegari, Mohammad, et al. 'Xnor-net: Imagenet classification using binary convolutional neural networks.' European Conference on Computer Vision. Springer, Cham, 2016. [40] Courbariaux, Matthieu, et al. 'Binarized neural networks: Training deep neural networks with weights and activations constrained to +1 or -1.' arXiv preprint arXiv:1602.02830 (2016). | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/69954 | - |
dc.description.abstract | Research on deep convolutional neural networks (CNNs) has been conducted for many years and has achieved remarkable results in the field of computer vision, and the related applications have made our lives more convenient and intuitive. However, state-of-the-art deep CNNs typically contain millions of parameters and billions of arithmetic operations, which prevents them from running efficiently on mobile devices and embedded systems.
In this thesis, we present an application-specific integrated circuit (ASIC) architecture design for deep CNNs, targeting applications that require real-time computation and low power consumption. There are two main challenges in designing such an architecture: first, the computation requires a large number of memory accesses; second, the distribution and varying precision of the data cause unnecessary power consumption during computation. Reducing memory accesses and performing arithmetic efficiently are therefore the focus of the architecture design. We systematically analyze the data reuse pattern best suited to each layer of a deep CNN and apply the optimal reuse pattern to reduce the number of memory accesses. We also propose a multiply-accumulate (MAC) implementation based on sign-and-magnitude arithmetic, and verify that this design achieves lower power consumption than the conventional two's-complement design. | zh_TW |
dc.description.provenance | Made available in DSpace on 2021-06-17T03:35:51Z (GMT). No. of bitstreams: 1 ntu-107-R04943022-1.pdf: 2789072 bytes, checksum: 514794e34767fa722bd90db0bb1f55e8 (MD5) Previous issue date: 2018 | en |
dc.description.tableofcontents | Abstract ix
1 Introduction 1
1.1 The Applications of Deep Convolutional Neural Networks 1
1.2 Motivation 2
1.3 Thesis Organization 4
2 Background 5
2.1 Machine Learning Overview 5
2.2 Neural Networks 6
2.2.1 Modeling a Neuron 7
2.2.2 Backpropagation Algorithm 9
2.2.3 Deep Learning 9
3 Convolutional Neural Network 13
3.1 Overview and Core Concepts 13
3.2 Important Features 13
3.3 Building Blocks 15
3.4 Popular Convolutional Neural Network Model 19
4 Architecture Design and Implementation of CNN Accelerator 23
4.1 Design Challenges and Considerations 23
4.2 Proposed Hardware Architecture 24
4.2.1 Reduce DRAM Access 27
4.2.2 Filter Adapted Dataflow 35
4.2.3 Energy Efficient Multiplier-Accumulator 42
4.2.4 Architecture and Design Features 43
4.2.5 Synthesis Results and Comparison 46
5 Conclusion 49
Bibliography 50 | |
dc.language.iso | en | |
dc.title | 節能與可重組化深度卷積神經網路架構設計 | zh_TW |
dc.title | Architecture Design of Energy-Efficient Reconfigurable
Deep Convolutional Neural Network Accelerator | en |
dc.type | Thesis | |
dc.date.schoolyear | 106-1 | |
dc.description.degree | Master | |
dc.contributor.oralexamcommittee | 蔡宗漢,劉宗德,楊佳玲 | |
dc.subject.keyword | 深度學習, 深度卷積神經網路, 資料重複使用設計, 高幀率架構設計, 晶片設計 | zh_TW |
dc.subject.keyword | deep learning, convolutional neural networks, data reuse, high frame rate architecture, chip design | en |
dc.relation.page | 57 | |
dc.identifier.doi | 10.6342/NTU201800518 | |
dc.rights.note | Paid authorization | |
dc.date.accepted | 2018-02-12 | |
dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW |
dc.contributor.author-dept | Graduate Institute of Electronics Engineering | zh_TW |
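The abstract's claim that a sign-and-magnitude MAC consumes less power than a conventional two's-complement one comes down to switching activity: CNN weights and activations cluster around zero, and in two's complement a small negative value (e.g. -1 = 0xFFFF in 16 bits) sets almost every bit, so datapath wires toggle heavily whenever values cross zero. Sign-and-magnitude keeps the magnitude bits small for small values and moves the sign into a single bit. The following Python sketch is an illustration of this encoding effect only (not the thesis's hardware design); it counts bit toggles between consecutive values in a zero-centered sequence under both encodings:

```python
def twos_complement(x, bits=16):
    # Standard two's-complement encoding: negatives wrap around 2**bits.
    return x & ((1 << bits) - 1)

def sign_magnitude(x, bits=16):
    # Sign bit in the MSB, absolute value in the remaining bits.
    if x < 0:
        return (1 << (bits - 1)) | (-x)
    return x

def toggles(a, b):
    # Number of bit positions that switch between consecutive encoded values,
    # a proxy for dynamic power on the datapath wires.
    return bin(a ^ b).count("1")

# Small zero-centered values, typical of CNN weights and activations.
seq = [3, -2, 1, -1, 2, -3]

tc = sum(toggles(twos_complement(a), twos_complement(b))
         for a, b in zip(seq, seq[1:]))
sm = sum(toggles(sign_magnitude(a), sign_magnitude(b))
         for a, b in zip(seq, seq[1:]))

print(tc, sm)  # prints 77 11
```

For this sequence, two's complement toggles roughly seven times as many bits as sign-and-magnitude, which is the intuition behind the lower switching power the thesis reports for its sign-and-magnitude MAC.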
Appears in Collections: | Graduate Institute of Electronics Engineering
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-107-1.pdf (currently not authorized for public access) | 2.72 MB | Adobe PDF |
Items in this system are protected by copyright, with all rights reserved, unless otherwise indicated in their respective license terms.