CIM-Friendly Deep Learning Method for 3D Reconstruction (友善於記憶體內運算的深度學習三維重建)

Ting-Wei Chang; 張庭維

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/84691

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	施吉昇(Chi-Sheng Shih)
dc.contributor.author	Ting-Wei Chang	en
dc.contributor.author	張庭維	zh_TW
dc.date.accessioned	2023-03-19T22:20:46Z	-
dc.date.copyright	2022-10-08
dc.date.issued	2022
dc.date.submitted	2022-09-12
dc.identifier.citation	[1] “Typical CNN architecture,” https://commons.wikimedia.org/wiki/File:Typical_cnn.png, [Online; accessed on 26-Aug-2022]. [2] J.-J. Chou, T.-W. Chang, X.-Y. Liu, T.-Y. Wu, Y.-K. Chen, Y.-T. Hsu, C.-W. Chen, T.-T. Liu, and C.-S. Shih, “CIM-Based Smart Pose Detection Sensors,” Sensors, vol. 22, no. 9, 2022. [Online]. Available: https://www.mdpi.com/1424-8220/22/9/3491 [3] Z. Jiang, S. Yin, J.-S. Seo, and M. Seok, “C3SRAM: An In-Memory-Computing SRAM Macro Based on Robust Capacitive Coupling Computing Mechanism,” IEEE Journal of Solid-State Circuits, vol. 55, no. 7, pp. 1888–1897, 2020. [4] J. Zhang, Z. Wang, and N. Verma, “In-Memory Computation of a Machine-Learning Classifier in a Standard 6T SRAM Array,” IEEE Journal of Solid-State Circuits, vol. 52, no. 4, pp. 915–924, 2017. [5] A. Biswas and A. P. Chandrakasan, “CONV-SRAM: An Energy-Efficient SRAM With In-Memory Dot-Product Computation for Low-Power Convolutional Neural Networks,” IEEE Journal of Solid-State Circuits, vol. 54, no. 1, pp. 217–230, 2019. [6] Y.-T. Hsu, C.-Y. Yao, T.-Y. Wu, T.-D. Chiueh, and T.-T. Liu, “A high-throughput energy–area-efficient computing-in-memory SRAM using unified charge-processing network,” IEEE Solid-State Circuits Letters, vol. 4, pp. 146–149, 2021. [7] X. Si, J.-J. Chen, Y.-N. Tu, W.-H. Huang, J.-H. Wang, Y.-C. Chiu, W.-C. Wei, S.-Y. Wu, X. Sun, R. Liu, S. Yu, R.-S. Liu, C.-C. Hsieh, K.-T. Tang, Q. Li, and M.-F. Chang, “A Twin-8T SRAM Computation-in-Memory Unit-Macro for Multibit CNN-Based AI Edge Processors,” IEEE Journal of Solid-State Circuits, vol. 55, no. 1, pp. 189–202, 2020. [8] C.-Y. Yao, “A fully bit-flexible computation-in-memory macro using capacitor YinYang array operation and embedded input sparsity sensing,” Master’s thesis, Graduate Institute of Electronics Engineering, National Taiwan University, Taipei, Taiwan, Jan. 2022. [9] Z. Song, S. Tang, F. Gu, C. Shi, and J. Feng, “DOE-based structured-light method for accurate 3D sensing,” Optics and Lasers in Engineering, vol. 120, pp. 21–30, 2019. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0143816618317408 [10] D. Moreno and G. Taubin, “Simple, Accurate, and Robust Projector-Camera Calibration,” in 2012 Second International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission, 2012, pp. 464–471. [11] F. MacWilliams and N. Sloane, “Pseudo-random sequences and arrays,” Proceedings of the IEEE, vol. 64, no. 12, pp. 1715–1729, 1976. [12] M. Qin and D. Vucinic, “Training Recurrent Neural Networks against Noisy Computations during Inference,” CoRR, vol. abs/1807.06555, 2018. [Online]. Available: http://arxiv.org/abs/1807.06555 [13] B. Zhang, L.-Y. Chen, and N. Verma, “Stochastic Data-driven Hardware Resilience to Efficiently Train Inference Models for Stochastic Hardware Implementations,” in ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019, pp. 1388–1392. [14] I. Hubara, M. Courbariaux, D. Soudry, R. El-Yaniv, and Y. Bengio, “Binarized Neural Networks,” in Advances in Neural Information Processing Systems, vol. 29. Curran Associates, Inc., 2016. [Online]. Available: https://proceedings.neurips.cc/paper/2016/file/d8330f857a17c53d217014ee776bfd50-Paper.pdf [15] R. Krishnamoorthi, “Quantizing deep convolutional networks for efficient inference: A whitepaper,” CoRR, vol. abs/1806.08342, 2018. [Online]. Available: http://arxiv.org/abs/1806.08342 [16] “FLIR GS3-U3-41C6NIR,” https://www.flir.asia/products/grasshopper3-usb3/?model=GS3-U3-41C6NIR-C, [Online; accessed on 4-Aug-2022]. [17] “TC1220-12MP,” https://tokina.co.jp/en/security/machine-vision-lenses/tc1220-12mp.html, [Online; accessed on 4-Aug-2022]. [18] “NEC VT700,” https://tw.nec.com/zh_TW/products/VT700/index.html, [Online; accessed on 4-Aug-2022]. [19] Y.-S. Lin, “End to End 3D Reconstruction with CNN Method Using Grid Point Based Structured Light Pattern,” Master’s thesis, Graduate Institute of Networking and Multimedia, National Taiwan University, Taipei, Taiwan, Jul. 2022.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/84691	-
dc.description.abstract	現今有許多應用使用到三維模型，像是智慧型手機使用的人臉辨識或虛擬實境所使用的角色模型，三維模型可以提供比二維模型更好的安全性以及更生動的表現。然而三維模型往往需要許多的運算資源消耗許多的能量，這使得三維模型的使用無法是永遠開啟的。為了改善能量消耗的問題，運算記憶體的架構被提出，透過運算記憶體即可使能量消耗大幅減少，使三維感測器可以永遠開啟達到實時的效果。而要在記憶體內運算有許多限制，包括只能使用卷積運算、只提供整數的運算以及類比電路所產生的非線性誤差。為了解決這些記憶體內運算所帶來的限制，本論文提出一套流程，可以訓練出適用於運算記憶體的深度學習模型，包括輸入、輸出以及權重的量化、每層網路輸出的標準化、網路整數權重的更新以及解決類比電路所帶來的誤差。在訓練出能在記憶體內運算的網路後，三維重建的深度網路便可使用在記憶體內運算的系統上。本篇論文的三維重建結果誤差能在1釐米以內，此方法亦能在較為複雜的面具模型上重建三維模型，在此誤差範圍下運算記憶體的三維模型可以適用於各種應用上，像是智慧型手機的人臉辨識系統。有了運算記憶體的特性可以實現低能耗以及永遠開啟的感測裝置。	zh_TW
dc.description.abstract	Many 3D recognition systems use 3D point clouds as input to improve accuracy and security, for example face recognition on smartphones or avatar models in virtual reality. However, processing 3D models often requires heavy computation, which consumes considerable energy and makes always-on use of 3D models impractical. The Computing in Memory (CIM) architecture has been proposed to reduce the energy consumption of the convolution operation. With a CIM chip, energy consumption can be reduced significantly so that the 3D sensor can be always-on. However, CIM operations have many limitations, including convolution-only operations, integer-only operations, and non-linearity errors caused by analog circuits. To address these limitations, this work proposes a set of procedures to train a deep learning model that is friendly to the CIM chip, including quantization of inputs, outputs, and weights, normalization of each layer's output, weight updating in integer precision, and compensation for the errors caused by analog circuits. Once the CIM-friendly network is trained, the deep learning network for 3D reconstruction can be used on the CIM system. The distance error of the 3D reconstruction results in this work is less than 1 cm. The method can also reconstruct 3D models of more complex mask objects. Within this error range, CIM-based 3D reconstruction can be applied to various applications, such as face ID systems for smartphones. The 3D reconstruction network for CIM can achieve low power consumption and always-on sensing, so that 3D models can be used more widely across devices.	en
dc.description.provenance	Made available in DSpace on 2023-03-19T22:20:46Z (GMT). No. of bitstreams: 1 U0001-2208202210520400.pdf: 6501112 bytes, checksum: 84f294f2d46094f3aaaa88db41569ae3 (MD5) Previous issue date: 2022	en
dc.description.tableofcontents	Contents: Oral Examination Committee Certification; Acknowledgements; Abstract (Chinese); Abstract; 1 Introduction; 1.1 Motivation; 1.2 Contribution; 1.3 Thesis Organization; 2 Background and Related Works; 2.1 Background; 2.1.1 Analog SRAM Computing in Memory; 2.1.2 Depth Estimation; 2.1.3 Structured Light Pattern; 2.1.4 Convolutional Neural Network; 2.2 Related Works; 3 System Architecture and Problem Definition; 3.1 System Architecture; 3.2 Problem Definition; 3.3 Challenges; 4 Design and Implementation; 4.1 System Design Flow; 4.2 Training the CIM-friendly Network; 4.2.1 Networks Architecture; 4.2.2 Normalization; 4.2.3 Quantization; 4.2.4 Weights Update; 4.2.5 CIM non-linearity; 4.3 CIM-friendly Inference; 5 Performance Evaluation; 5.1 Experiment Setting and Performance Metrics; 5.2 Evaluation Results; 6 Conclusion; Bibliography. List of Figures: 1.1 The Schematic of von Neumann bottleneck; 2.1 CIM SRAM Chip Framework; 2.2 The Transfer Curve of the non-linearity Error on CIM; 2.3 CIM operation overall; 2.4 Estimating 3D Coordinates by Triangulation; 2.5 Triangulation in Structure Light System; 2.6 Structured Light Pattern on Object; 2.7 Types of Structured Light Pattern; 2.8 CNN architectures [1]; 3.1 Overall System Architecture of CIM-friendly 3D reconstruction; 3.2 Physical System Configuration; 3.3 Overall Framework of Smart Phone Face ID; 4.1 System Design Flow; 4.2 Network for 3D Reconstruction on CIM; 4.3 Lost Value after Quantization; 4.4 Straight Through Estimator; 4.5 Simulate CIM Operation on General Device; 4.6 Post Processing for 3D Reconstruction; 5.1 3D Reconstruction Objects; 5.2 The masks to 3D reconstruction; 5.3 3D Reconstruction with Plane; 5.4 3D Reconstruction with Polyhedron Planes; 5.5 The 3D reconstruction result of masks; 5.6 The 3D reconstruction result of masks. List of Tables: 4.1 Influence of the standard deviation; 4.2 Influence of the Threshold; 5.1 Non-linearity Errors Metrics; 5.2 Results of the 3D plane reconstruction; 5.3 Network Weight Bit Comparison; 5.4 The Effectiveness of CIM non-linearity error; 5.5 Training Loss of Model Trained for CIM and GPU; 5.6 Training Loss of Resnet 34 Trained for CIM and GPU; 5.7 The Error of the CIM, GPU, and Refinement 3D Reconstruction Corresponding to the Ground Truth
dc.language.iso	zh-TW
dc.title	友善於記憶體內運算的深度學習三維重建	zh_TW
dc.title	CIM-Friendly Deep Learning Method for 3D Reconstruction	en
dc.type	Thesis
dc.date.schoolyear	110-2
dc.description.degree	Master
dc.contributor.oralexamcommittee	傅楸善(Chiou-Shann Fuh),廖弘源(Hong-Yuan Liao),叢培貴(Pei-Kuei Tsung)
dc.subject.keyword	記憶體內運算,量化網路訓練,三維重建	zh_TW
dc.subject.keyword	Computing in Memory,Quantization Aware Training,3D Reconstruction	en
dc.relation.page	46
dc.identifier.doi	10.6342/NTU202202632
dc.rights.note	Authorization granted (access restricted to campus)
dc.date.accepted	2022-09-12
dc.contributor.author-college	電機資訊學院	zh_TW
dc.contributor.author-dept	資訊工程學研究所	zh_TW
dc.date.embargo-lift	2024-09-01	-
Appears in Collections:	Department of Computer Science and Information Engineering (資訊工程學系)

Files in This Item:

File	Size	Format
U0001-2208202210520400.pdf Access restricted to NTU campus IP addresses (off campus, please use the VPN remote-access service)	6.35 MB	Adobe PDF

Items in this system are protected by copyright, with all rights reserved, unless otherwise indicated.