Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/81584

Full metadata record

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 郭大維(Tei-Wei Kuo) | |
| dc.contributor.author | Li-Yu Li | en |
| dc.contributor.author | 李立譽 | zh_TW |
| dc.date.accessioned | 2022-11-24T09:24:25Z | - |
| dc.date.available | 2022-11-24T09:24:25Z | - |
| dc.date.copyright | 2021-11-08 | |
| dc.date.issued | 2021 | |
| dc.date.submitted | 2021-09-25 | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/81584 | - |
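| dc.description.abstract | Self-powered systems have increasingly become a solution for the sustainability of Internet of Things (IoT) devices, but an unstable power supply also poses new challenges for computation-intensive applications such as intermittent deep inference. In recent years, thanks to the efforts of many researchers, deep inference can now run on intermittent systems. However, we observe that, because intermittent systems execute in interrupted bursts, the tiling algorithms traditionally used to optimize deep inference can perform worse than no optimization at all when the power supply is unstable. This thesis proposes an intermittent-aware tiling optimization that finds suitable tile sizes by taking intermittent behavior into account. By proposing an intermittent tile size explorer and an intermittent performance model, we can strike an appropriate balance between data reuse and the cost of intermittent execution. We implement and deploy the proposed method on a Texas Instruments device and evaluate its performance under different scenarios. Compared with conventional tiling optimization, our method guarantees safe execution within the energy budget and achieves 11% to 84% higher performance, depending on the scenario. | zh_TW |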
| dc.description.provenance | Made available in DSpace on 2022-11-24T09:24:25Z (GMT). No. of bitstreams: 1 U0001-2209202122531900.pdf: 7463903 bytes, checksum: f7236252c8ce110b5509a5318bc32df1 (MD5) Previous issue date: 2021 | en |
| dc.description.tableofcontents | Acknowledgements -- ii Abstract (in Chinese) -- iii Abstract -- iv Contents -- vii List of Figures -- ix Chapter 1 Introduction -- 1 Chapter 2 Background -- 5 2.1 Deep Neural Network Inference -- 5 2.2 GeMM Tiling Optimization -- 7 2.3 Intermittent Deep Inference -- 9 Chapter 3 Motivation and Challenges -- 11 Chapter 4 Methodology -- 14 4.1 Overview -- 14 4.2 Tile Size Explorer -- 16 4.3 Performance Model -- 18 4.3.1 Objective Function -- 18 4.3.2 Cycle Throughput -- 19 4.3.3 Attainable Computational Throughput -- 20 4.4 Generality -- 22 Chapter 5 Implementation -- 24 5.1 Input Profiling -- 24 5.2 Accelerated Intermittent Inference -- 25 Chapter 6 Performance Evaluation -- 27 6.1 Experimental Setup -- 27 6.2 Experimental Results -- 28 6.2.1 GeMM Inference Latency -- 28 6.2.2 DNN Inference Latency -- 29 Chapter 7 Conclusion -- 31 References -- 33 | |
| dc.language.iso | en | |
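| dc.subject | 深度推論 (Deep inference) | zh_TW |
| dc.subject | 自供電系統 (Self-powered system) | zh_TW |
| dc.subject | 物聯網 (Internet of Things) | zh_TW |
| dc.subject | 間歇性系統 (Intermittent system) | zh_TW |
| dc.subject | 分塊優化 (Tiling optimization) | zh_TW |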
| dc.subject | Energy harvesting | en |
| dc.subject | Internet of Things | en |
| dc.subject | Intermittent system | en |
| dc.subject | Deep inference | en |
| dc.subject | Tiling optimization | en |
| dc.title | 間歇性深度推論之分塊尺寸最佳化 | zh_TW |
| dc.title | Intermittent-aware Tile Size Optimization for Deep Inference | en |
| dc.date.schoolyear | 109-2 | |
| dc.description.degree | Master's | |
| dc.contributor.coadvisor | 修丕承(Pi-Cheng Hsiu) | |
| dc.contributor.oralexamcommittee | 王克中(Hsin-Tsai Liu),張原豪(Chih-Yang Tseng),洪士灝 | |
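| dc.subject.keyword | 自供電系統 (Self-powered system), 物聯網 (Internet of Things), 間歇性系統 (Intermittent system), 分塊優化 (Tiling optimization), 深度推論 (Deep inference) | zh_TW |
| dc.subject.keyword | Energy harvesting, Internet of Things, Intermittent system, Deep inference, Tiling optimization | en |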
| dc.relation.page | 38 | |
| dc.identifier.doi | 10.6342/NTU202103299 | |
| dc.rights.note | Not authorized | |
| dc.date.accepted | 2021-09-27 | |
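| dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW |
| dc.contributor.author-dept | Graduate Institute of Networking and Multimedia | zh_TW |
| Appears in Collections: | Graduate Institute of Networking and Multimedia | |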
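Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| U0001-2209202122531900.pdf (not authorized for public access) | 7.29 MB | Adobe PDF | |
Unless their copyright terms are specifically stated otherwise, all items in the system are protected by copyright, with all rights reserved.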
