NTU Theses and Dissertations Repository
College of Electrical Engineering and Computer Science, Graduate Institute of Networking and Multimedia
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/81584
Full metadata record
DC field: value [language]
dc.contributor.advisor: 郭大維 (Tei-Wei Kuo)
dc.contributor.author: Li-Yu Lien
dc.contributor.author: 李立譽 [zh_TW]
dc.date.accessioned: 2022-11-24T09:24:25Z
dc.date.available: 2022-11-24T09:24:25Z
dc.date.copyright: 2021-11-08
dc.date.issued: 2021
dc.date.submitted: 2021-09-25
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/81584
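dc.description.abstract: Self-powered systems have gradually become a solution for the sustainability of Internet of Things (IoT) devices, but an unstable power supply also poses new challenges for computation-intensive applications such as intermittent deep inference. In recent years, thanks to the efforts of many researchers, deep inference can now run on intermittent systems. However, we observe that, because intermittent systems execute in a stop-and-go fashion, the tiling algorithms traditionally used to optimize deep inference can perform worse than no optimization at all when the power supply is unstable. This thesis proposes an intermittent-aware tiling optimization that finds suitable tile sizes by taking intermittent behavior into account. Using the proposed intermittent tile size explorer and intermittent performance model, we strike an appropriate balance between data reuse and the cost of intermittent execution. We implement and deploy the proposed method on a Texas Instruments device and evaluate its performance under different scenarios. Compared with conventional tiling optimization, our method guarantees safe execution within the energy budget and achieves 11% to 84% higher performance, depending on the scenario.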
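The trade-off described in the abstract, between data reuse and the cost of intermittent execution, can be illustrated with a minimal sketch. This is not the thesis's tile size explorer or performance model, which this record does not specify. The cost formula, the constants (`e_load`, `e_mac`, `e_store`, `reboot_cost`), the candidate tile sizes, and the function names are all hypothetical choices made for illustration. The sketch assumes a tile must finish within one power cycle, and that every power cycle adds a fixed recovery cost.

```python
from itertools import product
from math import ceil


def tile_cost(tm, tn, tk, e_load=2.0, e_mac=1.0, e_store=4.0):
    """Hypothetical energy (arbitrary units) to fetch, compute, and commit one tile."""
    loads = tm * tk + tk * tn   # words of A and B read from non-volatile memory
    macs = tm * tn * tk         # multiply-accumulate operations in the tile
    stores = tm * tn            # partial outputs written back to preserve progress
    return e_load * loads + e_mac * macs + e_store * stores


def search_tile_size(M, N, K, sram_words, energy_budget, reboot_cost=500.0):
    """Exhaustively search tile sizes for an M x K by K x N matrix product.

    Returns (total_cost, (tm, tn, tk)) for the cheapest feasible tile,
    or None when no candidate fits the memory and energy constraints.
    """
    best = None
    candidates = [1, 2, 4, 8, 16, 32]
    for tm, tn, tk in product(candidates, repeat=3):
        if tm > M or tn > N or tk > K:
            continue
        if tm * tk + tk * tn + tm * tn > sram_words:
            continue  # tile does not fit in volatile memory
        energy = tile_cost(tm, tn, tk)
        if energy > energy_budget:
            continue  # tile could never finish within one power cycle
        n_tiles = ceil(M / tm) * ceil(N / tn) * ceil(K / tk)
        tiles_per_cycle = int(energy_budget // energy)
        n_cycles = ceil(n_tiles / tiles_per_cycle)
        total = n_tiles * energy + n_cycles * reboot_cost
        if best is None or total < best[0]:
            best = (total, (tm, tn, tk))
    return best


if __name__ == "__main__":
    print(search_tile_size(32, 32, 32, sram_words=1024, energy_budget=5000.0))
```

In this toy model, larger tiles reuse loaded data more often, but a tile whose energy exceeds the budget is rejected outright, and tiles that leave much of the budget unused cause extra power cycles. A conventional tile size search would correspond to dropping the energy check and the `reboot_cost` term.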
dc.description.provenance: Made available in DSpace on 2022-11-24T09:24:25Z (GMT). No. of bitstreams: 1. U0001-2209202122531900.pdf: 7463903 bytes, checksum: f7236252c8ce110b5509a5318bc32df1 (MD5). Previous issue date: 2021 [en]
dc.description.tableofcontents:
Acknowledgements
Abstract (in Chinese)
Abstract
Contents
List of Figures
Chapter 1 Introduction
Chapter 2 Background
  2.1 Deep Neural Network Inference
  2.2 GeMM Tiling Optimization
  2.3 Intermittent Deep Inference
Chapter 3 Motivation and Challenges
Chapter 4 Methodology
  4.1 Overview
  4.2 Tile Size Explorer
  4.3 Performance Model
    4.3.1 Objective Function
    4.3.2 Cycle Throughput
    4.3.3 Attainable Computational Throughput
  4.4 Generality
Chapter 5 Implementation
  5.1 Input Profiling
  5.2 Accelerated Intermittent Inference
Chapter 6 Performance Evaluation
  6.1 Experimental Setup
  6.2 Experimental Results
    6.2.1 GeMM Inference Latency
    6.2.2 DNN Inference Latency
Chapter 7 Conclusion
References
dc.language.iso: en
dc.subject: 深度推論 [zh_TW]
dc.subject: 自供電系統 [zh_TW]
dc.subject: 物聯網 [zh_TW]
dc.subject: 間歇性系統 [zh_TW]
dc.subject: 分塊優化 [zh_TW]
dc.subject: Energy harvesting [en]
dc.subject: Internet of Things [en]
dc.subject: Intermittent system [en]
dc.subject: Deep inference [en]
dc.subject: Tiling optimization [en]
dc.title: 間歇性深度推論之分塊尺寸最佳化 [zh_TW]
dc.title: Intermittent-aware Tile Size Optimization for Deep Inference [en]
dc.date.schoolyear: 109-2
dc.description.degree: Master
dc.contributor.coadvisor: 修丕承 (Pi-Cheng Hsiu)
dc.contributor.oralexamcommittee: 王克中 (Hsin-Tsai Liu), 張原豪 (Chih-Yang Tseng), 洪士灝
dc.subject.keyword: 自供電系統, 物聯網, 間歇性系統, 分塊優化, 深度推論 [zh_TW]
dc.subject.keyword: Energy harvesting, Internet of Things, Intermittent system, Deep inference, Tiling optimization [en]
dc.relation.page: 38
dc.identifier.doi: 10.6342/NTU202103299
dc.rights.note: Not authorized
dc.date.accepted: 2021-09-27
dc.contributor.author-college: College of Electrical Engineering and Computer Science [zh_TW]
dc.contributor.author-dept: Graduate Institute of Networking and Multimedia [zh_TW]
Appears in collections: Graduate Institute of Networking and Multimedia

Files in this item:
File: U0001-2209202122531900.pdf (not authorized for public access)
Size: 7.29 MB
Format: Adobe PDF


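Unless their copyright terms are specifically indicated otherwise, items in this system are protected by copyright, with all rights reserved.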

© NTU Library All Rights Reserved