Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/74828
Full metadata record
DC field: value (language)
dc.contributor.advisor: 廖世偉
dc.contributor.author: Bo-Ru Zhao (en)
dc.contributor.author: 趙柏儒 (zh_TW)
dc.date.accessioned: 2021-06-17T09:08:22Z
dc.date.available: 2024-12-02
dc.date.copyright: 2019-12-02
dc.date.issued: 2019
dc.date.submitted: 2019-11-14
dc.identifier.citation:
[1] Khronos Group, 'The OpenVX API for hardware acceleration', https://www.khronos.org/openvx, 2013.
[2] OpenVX NNE, https://www.khronos.org/registry/vx/extensions/neural_network/html/index.html
[3] Ragan-Kelley, J., Adams, A., Paris, S., Levoy, M., Amarasinghe, S. and Durand, F., “Decoupling algorithms from schedules for easy optimization of image processing pipelines”, ACM Transactions on Graphics, 31, 4, 32, 2012.
[4] J. Ragan-Kelley, C. Barnes, A. Adams, S. Paris, F. Durand, and S. Amarasinghe. Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines. In Proceedings of the 34th ACM SIGPLAN conference on Programming language design and implementation. ACM, 2013.
[5] NVIDIA CUDA C Programming Guide v10.1. NVIDIA, May. 2019.
[6] Advanced Micro Devices, Inc. AMD APP SDK - A Complete Development Platform, 2015.
[7] R. T. Mullapudi, A. Adams, D. Sharlet, J. Ragan-Kelley, and K. Fatahalian. 2016. Automatically scheduling Halide image processing pipelines. ACM Transactions on Graphics 35, 4, Article 83 (July 2016), 11 pages.
[8] A. Krizhevsky, I. Sutskever, and G. Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems 25, pages 1106–1114, 2012.
[9] E. Rainey, J. Villarreal, G. Dedeoglu, K. Pulli, T. Lepley, and F. Brill, “Addressing System-Level Optimization with OpenVX Graphs,” in IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2014, pp. 658–663.
[10] Canis, A., Choi, J., Aldham, M., Zhang, V., Kammoona, A., Anderson, J.H., Brown, S., Czajkowski, T.: LegUp: high-level synthesis for FPGA-based processor/accelerator systems. In: Proceedings of the 19th ACM/SIGDA International Symposium on Field Programmable Gate Arrays, pp. 33–36. ACM (2011)
[11] Gehrig, S.K., Eberli, F., Meyer, T.: A real-time low-power stereo vision engine using semi-global matching. In: Computer Vision Systems, pp. 134–143. Springer (2009)
[12] Lei, Y., Gang, Z., Si-Heon, R., Choon-Young, L., Sang-Ryong, L., Bae, K.M.: The platform of image acquisition and processing system based on DSP and FPGA. In: International Conference on Smart Manufacturing Application, pp. 470–473. IEEE (2008)
[13] Cong, J., Ghodrat, M.A., Gill, M., Grigorian, B., Reinman, G.: CHARM: a composable heterogeneous accelerator-rich microprocessor. In: Proceedings of the 2012 ACM/IEEE International Symposium on Low Power Electronics and Design, pp. 379–384. ACM (2012)
[14] Cong, J., Liu, C., Ghodrat, M.A., Reinman, G., Gill, M., Zou, Y.: AXR-CMP: architecture support in accelerator-rich CMPs. In: 2nd Workshop on SoC Architecture, Accelerators and Workloads (2011)
[15] Farabet, C., Martini, B., Corda, B., Akselrod, P., Culurciello, E., LeCun, Y.: Neuflow: a runtime reconfigurable dataflow processor for vision. In: 2011 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 109–116. IEEE (2011)
[16] Hegarty, J., Brunhaver, J., DeVito, Z., Ragan-Kelley, J., Cohen, N., Bell, S., Vasilyev, A., Horowitz, M., Hanrahan, P.: Darkroom: compiling high-level image processing code into hardware pipelines. In: Proceedings of the 41st International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH) (2014)
[17] OpenCV Library Homepage. http://www.opencv.com/
[18] Coombs, J., Prabhu, R., Peake, G.: Overcoming the challenges of porting OpenCV to TI’s embedded ARM+ DSP platforms. Int. J. Electr. Eng. Educ. 49(3), 260–274 (2012)
[19] Tegra Android Development Documentation Website. http://docs.nvidia.com/tegra/index.html
[20] Qualcomm (2015) Computer Vision (FastCV). https://developer.qualcomm.com/computer-vision-fastcv
[21] J. E. Stone, D. Gohara, and G. Shi. OpenCL: A parallel programming standard for heterogeneous computing systems. Computing in science & engineering, 2010.
[22] Czajkowski, T.S., Aydonat, U., Denisenko, D., Freeman, J., Kinsner, M., Neto, D., Wong, J., Yiannacouras, P., Singh, D.P.: From OpenCL to high-performance hardware on FPGAs. In: 22nd International Conference on Field Programmable Logic and Applications (FPL), pp. 531–534. IEEE (2012)
[23] P. Boudier and G. Sellers. Memory System on Fusion APUs. AMD fusion developer summit, 2011.
[24] G. Tagliavini et al., “Optimizing memory bandwidth exploitation for openvx applications on embedded many-core accelerators,” Journal of Real-Time Image Processing, 2016
[25] Tagliavini, G., Haugou, G., Benini, L.: Optimizing memory bandwidth in OpenVX graph execution on embedded many-core accelerators. In: 2014 Conference on Design and Architectures for Signal and Image Processing (DASIP), pp. 1–8. IEEE (2014)
[26] D. Dekkiche, B. Vincke, and A. Merigot, “Investigation and performance analysis of openvx optimizations on computer vision applications,” in 14th International Conference on Control, Automation, Robotics and Vision, 2016, pp. 1–6.
[27] G. Tagliavini, G. Haugou, A. Marongiu, and L. Benini, “ADRENALINE: an OpenVX environment to optimize embedded vision applications on many-core accelerators,” in IEEE 9th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC), 2015, pp. 289–296.
[28] Computer Vision Hardware and Software Market to Reach $48.6 Billion by 2022, https://www.tractica.com/newsroom/press-releases/computer-vision-hardware-and-software-market-to-reach-48-6-billion-by-2022
[29] Deep Learning Enterprise Software Spending to Surpass $40 Billion Worldwide by 2024, https://www.tractica.com/newsroom/press-releases/deep-learning-enterprise-software-spending-to-surpass-40-billion-worldwide-by-2024/
[30] Mullapudi, R. T., Vasista, V., and Bondhugula, U. 2015. PolyMage: Automatic optimization for image processing pipelines. In Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, 429–443.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/74828
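dc.description.abstract: In this study, we investigate how to use a domain-specific language, Halide, to build a framework for rapidly designing and optimizing OpenVX graphs. Halide is a high-level image processing language that lets developers separate the algorithm from the schedule when writing a program, which makes development friendlier. Halide has also proven to be an effective system for writing high-performance image processing programs. We build a framework with OpenVX and Halide to perform image processing. Because Halide provides the scheduling primitives that OpenVX lacks, we use Halide to execute OpenVX kernels. This approach gives developers more flexibility and achieves better performance. We evaluate it with five experiments, and the results show that using Halide together with OpenVX significantly improves the performance of image processing and convolutional neural networks. (translated from zh_TW)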
dc.description.abstract: In this study, we investigate how to use a domain-specific language, Halide, to build a framework for fast prototyping and optimization of OpenVX graphs. Halide is a high-level language for image processing pipelines. It lets developers separate a program into an algorithm and a schedule, which makes development friendlier. Halide has also proven to be an effective system for authoring high-performance image processing code. We built a framework with OpenVX and Halide to implement an image processing system. Because OpenVX lacks scheduling primitives and Halide provides them, we integrated Halide into OpenVX graphs. This method can significantly improve the performance of image processing and convolutional neural networks. (en)
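The separation of algorithm and schedule mentioned in the abstract can be illustrated with a small sketch. It is plain Python, not Halide's API and not code from the thesis. The function names (`blur_x`, `blur_y_from`, `schedule_root`, `schedule_inline`) are hypothetical and chosen only for illustration. The same two-stage box filter is evaluated under two evaluation orders: one stores the intermediate stage in a buffer first, the other recomputes it on demand. Both orders should give the same result, which is the property that lets a schedule be changed without touching the algorithm.

```python
# Illustrative only: plain Python, not Halide's API.
# The "algorithm" says what each stage computes; the "schedules"
# below decide when and where the intermediate stage is evaluated.

def blur_x(img, x, y):
    """Algorithm, stage 1: horizontal 3-tap box sum at (x, y)."""
    return img[y][x] + img[y][x + 1] + img[y][x + 2]


def blur_y_from(bx, x, y):
    """Algorithm, stage 2: vertical 3-tap box sum over stage-1 values."""
    return bx(x, y) + bx(x, y + 1) + bx(x, y + 2)


def schedule_root(img):
    """Schedule A: compute all of stage 1 into a buffer, then run stage 2."""
    h, w = len(img), len(img[0])
    tmp = [[blur_x(img, x, y) for x in range(w - 2)] for y in range(h)]
    return [[blur_y_from(lambda i, j: tmp[j][i], x, y)
             for x in range(w - 2)]
            for y in range(h - 2)]


def schedule_inline(img):
    """Schedule B: recompute stage 1 on demand inside stage 2 (no buffer)."""
    h, w = len(img), len(img[0])
    return [[blur_y_from(lambda i, j: blur_x(img, i, j), x, y)
             for x in range(w - 2)]
            for y in range(h - 2)]


# A small deterministic test image (5 rows, 6 columns).
image = [[(3 * x + 5 * y) % 7 for x in range(6)] for y in range(5)]
result_root = schedule_root(image)
result_inline = schedule_inline(image)
```

Schedule A trades memory for less recomputation, while Schedule B trades recomputation for better locality. Choosing between such options per stage is the kind of decision that scheduling primitives expose.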
dc.description.provenance: Made available in DSpace on 2021-06-17T09:08:22Z (GMT). No. of bitstreams: 1
ntu-108-R06922013-1.pdf: 7505472 bytes, checksum: 7594f1e5fc7963b4c21f0c78a884da89 (MD5)
Previous issue date: 2019 (en)
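dc.description.tableofcontents:
Oral Examination Committee Certification #
Acknowledgements i
Abstract (Chinese) ii
Abstract iii
Contents iv
List of Figures vi
List of Tables vii
Chapter 1 Introduction 1
1.1 Research Background 1
1.2 Research Motivation 3
1.3 Research Method 4
Chapter 2 Background and Literature Review 5
2.1 OpenVX 5
2.2 Halide 11
2.2.1 Producer-Consumer Locality Scheduling 14
2.2.2 Input Data Reuse Scheduling 15
2.3 Research Background 16
Chapter 3 Experimental Design and Optimization Methods 18
3.1 The Schedule Optimization Problem 18
3.2 Scheduling Algorithm 19
3.2.1 Function Preprocessing 20
3.2.2 Function Grouping and Tiling 21
3.2.3 Function Inlining 25
3.2.4 Final Function Generation 26
3.3 Experimental Design 27
Chapter 4 Experimental Results 28
4.1 Comparison of OpenVX and Halide Data Access Patterns 28
4.2 Replacing a Single OpenVX Kernel with Halide 30
4.3 Implementing Kernel Merging on OpenVX with Halide 32
4.4 Optimizing Graphs on OpenVX with Halide 33
4.5 Optimizing OpenVX NNE with Halide 34
Chapter 5 Conclusion 37
References 38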
dc.language.iso: zh-TW
dc.subject: OpenVX (en, zh_TW)
dc.subject: Halide (en, zh_TW)
dc.subject: image processing (en, zh_TW)
dc.subject: convolutional neural networks (en, zh_TW)
dc.title: 使用Halide框架設計和優化OpenVX應用 (zh_TW)
dc.title: Design and Optimization of OpenVX Applications with Halide framework (en)
dc.type: Thesis
dc.date.schoolyear: 108-1
dc.description.degree: Master's
dc.contributor.oralexamcommittee: 洪士灝, 游逸平, 洪明郁, 葉羅堯
dc.subject.keyword: OpenVX, Halide, image processing, convolutional neural networks (en, zh_TW)
dc.relation.page: 41
dc.identifier.doi: 10.6342/NTU201904278
dc.rights.note: Paid license
dc.date.accepted: 2019-11-14
dc.contributor.author-college: College of Electrical Engineering and Computer Science (translated from zh_TW)
dc.contributor.author-dept: Graduate Institute of Computer Science and Information Engineering (translated from zh_TW)
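Appears in collections: Department of Computer Science and Information Engineering

Files in this item:
File: ntu-108-1.pdf (not authorized for public access)
Size: 7.33 MB
Format: Adobe PDF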
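Unless their copyright terms are specifically stated otherwise, all items in the system are protected by copyright, with all rights reserved.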
