Skip navigation

DSpace

機構典藏 DSpace 系統致力於保存各式數位資料(如:文字、圖片、PDF)並使其易於取用。

點此認識 DSpace
DSpace logo
English
中文
  • 瀏覽論文
    • 校院系所
    • 出版年
    • 作者
    • 標題
    • 關鍵字
    • 指導教授
  • 搜尋 TDR
  • 授權 Q&A
    • 我的頁面
    • 接受 E-mail 通知
    • 編輯個人資料
  1. NTU Theses and Dissertations Repository
  2. 電機資訊學院
  3. 資訊網路與多媒體研究所
請用此 Handle URI 來引用此文件: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88343
完整後設資料紀錄
DC 欄位值語言
dc.contributor.advisor廖世偉zh_TW
dc.contributor.advisorShih-Wei Liaoen
dc.contributor.author王祥宇zh_TW
dc.contributor.authorXiang-Yu Wangen
dc.date.accessioned2023-08-09T16:38:17Z-
dc.date.available2023-11-09-
dc.date.copyright2023-08-09-
dc.date.issued2023-
dc.date.submitted2023-07-25-
dc.identifier.citation[1] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, M. Kudlur, J. Levenberg, R. Monga, S. Moore, D. G. Murray, B. Steiner, P. Tucker, V. Vasudevan, P. Warden, M. Wicke, Y. Yu, and X. Zheng. Tensorflow: A system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pages 265–283, 2016.
[2] A. H. Ashouri, W. Killian, J. Cavazos, G. Palermo, and C. Silvano. A survey on compiler autotuning using machine learning. ACM Computing Surveys (CSUR), 51(5):1–42, 2018.
[3] C. Bastoul. Code generation in the polyhedral model is easier than you think. In PACT’13 IEEE International Conference on Parallel Architecture and Compilation Techniques, pages 7–16, Juan-les-Pins, France, September 2004.
[4] M.-W. Benabderrahmane, L.-N. Pouchet, A. Cohen, and C. Bastoul. The polyhedral model is more widely applicable than you think. In R. Gupta, editor, Compiler Construction, pages 283–303, Berlin, Heidelberg, 2010. Springer Berlin Heidelb
[5] A. Bik, P. Koanantakool, T. Shpeisman, N. Vasilache, B. Zheng, and F. Kjols Compiler support for sparse tensor computations in mlir. ACM Transactions on Architecture and Code Optimization (TACO), 19(4):1–25, 2022.
[6] U. Bondhugula, A. Hartono, J. Ramanujam, and P. Sadayappan. A practical automatic polyhedral parallelizer and locality optimizer. SIGPLAN Not., 43(6):101–113, June 2008.
[7] A. Brauckmann, A. Goens, and J. Castrillon. Polygym: Polyhedral optimizations as an environment for reinforcement learning. In 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT), pages 17–29, 2021.
[8] T. Chen, T. Moreau, Z. Jiang, L. Zheng, E. Yan, M. Cowan, H. Shen, L. Wang, Y. Hu, L. Ceze, C. Guestrin, and A. Krishnamurthy. Tvm: An automated end-to-end optimizing compiler for deep learning, 2018.
[9] J. Clemons, H. Zhu, S. Savarese, and T. Austin. Mevbench: A mobile computer
vision benchmarking suite. In 2011 IEEE International Symposium on Workload Characterization (IISWC), pages 91–102, 2011.
[10] J. Clemons, H. Zhu, S. Savarese, and T. Austin. Mevbench: A mobile computer vision benchmarking suite. In 2011 IEEE International Symposium on Workload Characterization (IISWC), pages 91–102, 2011.
[11] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection.
In 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), volume 1, pages 886–893 vol. 1, 2005.
[12] G. Fursin, Y. Kashnikov, A. W. Memon, Z. Chamski, O. Temam, M. Namolaru, E. Yom-Tov, B. Mendelson, A. Zaks, E. Courtois, et al. Milepost gcc: Machin learning enabled self-tuning compiler. International journal of parallel programming, 39:296–327, 2011.
[13] A. Haj-Ali, N. K. Ahmed, T. Willke, Y. S. Shao, K. Asanovic, and I. Stoica. Neurovectorizer: End-to-end vectorization with deep reinforcement learning. In Proceedings of the 18th ACM/IEEE International Symposium on Code Generation and Optimization, pages 242–255, 2020.
[14] Q. Huang, A. Haj-Ali, W. Moses, J. Xiang, I. Stoica, K. Asanovic, and J. Wawrzynek. Autophase: Compiler phase-ordering for hls with deep reinforcement learning. In 2019 IEEE 27th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), pages 308–308. IEEE, 2019.
[15] S. Jain, Y. Andaluri, S. VenkataKeerthy, and R. Upadrasta. Poset-rl: Phase ordering for optimizing size and execution time using reinforcement learning. In 2022 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pages 121–131. IEEE, 2022.
[16] D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
[17] S. Kirkpatrick, C. D. Gelatt Jr, and M. P. Vecchi. Optimization by simulated annealing. science, 220(4598):671–680, 1983.
[18] M. D. Lam, E. E. Rothberg, and M. E. Wolf. The cache performance and optimizations of blocked algorithms. ACM SIGOPS Operating Systems Review, 25(Special Issue):63–74, 1991.
[19] C. Lattner and V. Adve. Llvm: A compilation framework for lifelong program ysis amp; transformation. In Proceedings of the International Symposium on Code Generation and Optimization: Feedback-Directed and Runtime Optimization, CGO ’04, page 75, USA, 2004. IEEE Computer Society.
[20] C. Lattner, M. Amini, U. Bondhugula, A. Cohen, A. Davis, J. Pienaar, R. Riddle, T. Shpeisman, N. Vasilache, and O. Zinenko. Mlir: Scaling compiler infrastructure for domain specific computation. In 2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), pages 2–14, 2021.
[21] W. S. Moses, L. Chelini, R. Zhao, and O. Zinenko. Polygeist: Raising c to polyhedral mlir. In 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT), pages 45–59, 2021.
[22] R. T. Mullapudi, A. Adams, D. Sharlet, J. Ragan-Kelley, and K. Fatahalian. Automatically scheduling halide image processing pipelines. ACM Trans. Graph., 35(4), July 2016.
[23] M. Nagiub and W. Farag. Automatic selection of compiler options using genetic techniques for embedded software design. In 2013 IEEE 14th international symposium on computational intelligence and informatics (CINTI), pages 69–74. IEEE, 2013.
[24] Y. Nourani and B. Andresen. A comparison of simulated annealing cooling strategies. Journal of Physics A: Mathematical and General, 31(41):8373, 1998.
[25] J. Ragan-Kelley, C. Barnes, A. Adams, S. Paris, F. Durand, and S. Amarasinghe. Halide: A language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines. In Proceedings of the 34th ACM SIGPL Conference on Programming Language Design and Implementation, PLDI ’13, page 519–530, New York, NY, USA, 2013. Association for Computing Machinery.
[26] F. Romeo and A. Sangiovanni-Vincentelli. A theoretical framework for simulated annealing. Algorithmica, 6:302–345, 1991.
[27] M. O. T. Kisuki, P.M.W. Knijnenburg. Combined selection of tile sizes and unroll
factors using iterative compilation. In Proceedings 2000 International Conference on Parallel Architectures and Compilation Techniques, 2000.
[28] S. Verdoolaege. Isl: An integer set library for the polyhedral model. In Proceedings of the Third International Congress Conference on Mathematical Software, ICMS’10, page 299–302, Berlin, Heidelberg, 2010. Springer-Verlag.
[29] E. Wang, Q. Zhang, B. Shen, G. Zhang, X. Lu, Q. Wu, and Y. Wang. Intel Math Kernel Library, pages 167–188. Springer International Publishing, Cham, 2014.
[30] S.-L. Wu, X.-Y. Wang, M.-Y. Peng, and S.-W. Liao. Accelerating openvx through halide and mlir. Journal of Signal Processing Systems, pages 1–14, 2023.
[31] Z. Xianyi, W. Qian, and Z. Yunquan. Model-driven level 3 blas performance optimization on loongson 3a processor. In 2012 IEEE 18th international conference on parallel and distributed systems, pages 684–691. IEEE, 20
-
dc.identifier.urihttp://tdr.lib.ntu.edu.tw/jspui/handle/123456789/88343-
dc.description.abstract電腦視覺算法在影像中心的設備上的使用越來越多,這使得我們需要優化這些演算法以提高其在資源受限的平台上的性能。Halide是一種在C++上的特定領域語言用來處理高性能圖像處理演算法,可以將演算法與指令排程分離。本論文提出了一種使用模擬退火的隨機演算法優化排程的方法,以提高Halide電腦視覺演算法的性能,從而實現在有限時間內探索仿射轉換的全局最優解。為了將Halide程式碼轉換為MLIR,我們使用了新的編譯流程,即Halide到MLIR(HTM)編譯器。此方法的效能將在不同平台上進行評估,例如x86、Arm和RISC-V。研究展示了MLIR在仿射方言(Affine Dialect)上的轉換和優化能力,並顯示了使用隨機演算法優化基礎建設以充分最佳化MLIR不同優化能力的必要性。zh_TW
dc.description.abstractThe increasing usage of computer vision algorithms in camera-centric devices has led to a growing need for optimizing these algorithms to improve their performance on resource-constrained platforms. Halide is a language specific to image processing algorithms that separates the algorithm's scheduling from its implementation, resulting in high performance. This thesis proposes an approach to improve the performance of Halide computer vision algorithms using stochastic algorithms such as simulated annealing to optimize scheduling, which enables exploring the global optimum of affine scheduling in constrained time. To convert the Halide program to MLIR, we use novel compile flows, namely the Halide to MLIR (HTM) converter. The efficacy of the approach will be evaluated on different platforms, such as x86, ARM, and RISC-V. The study demonstrates the potential of MLIR's transformation and optimization capabilities on Affine dialects and highlights the need for tuning infrastructure to fully leverage MLIR's optimization capabilities.en
dc.description.provenanceSubmitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-08-09T16:38:17Z
No. of bitstreams: 0
en
dc.description.provenanceMade available in DSpace on 2023-08-09T16:38:17Z (GMT). No. of bitstreams: 0en
dc.description.tableofcontentsVerification Letter from the Oral Examination Committee i
Acknowledgements ii
摘要 iii
Abstract iv
Contents v
List of Figures vii
List of Tables ix
Chapter 1 Introduction 1
Chapter 2. Background 5
2.1. Halide 5
2.2. LLVM 6
2.3. MLIR. 7
2.4. Affine transformation 9
Chapter 3 Motivation. 13
3.1. Motivation. 13
Chapter 4 Design and Implementation 16
4.1 HTM.................................. 16
4.2 stochastic optimization algorithm ................... 17
4.2.1 Simulatedannealing.......................... 17
4.2.1.1 Markov decision process ................ 19
4.3 code generation 20
Chapter 5 Evaluation and Discussion 23
5.1 Environment Setup. 23
5.2 Experiment Steps. 24
5.3 Result. 25
5.4 Discussion. 31
Chapter 6 Conclusion and Future work. 35
6.1 Conclusion. 35
6.2 Related work. 37
6.3 Future work. 39
References. 41
-
dc.language.isoen-
dc.subject隨機優化演算法zh_TW
dc.subject仿射轉換zh_TW
dc.subjectMLIRzh_TW
dc.subjectHalidezh_TW
dc.subjectHalideen
dc.subjectMLIRen
dc.subjectAffine transformationen
dc.subjectstochastic optimization algorithmen
dc.title在MLIR上透過隨機優化演算法自動化Affine排程並加速Halidezh_TW
dc.titleAccelerating Halide framework by automatic Affine scheduling using the stochastic optimization algorithm on MLIRen
dc.typeThesis-
dc.date.schoolyear111-2-
dc.description.degree碩士-
dc.contributor.oralexamcommittee陸忠立;黃敬群;鄭振牟;黎士瑋zh_TW
dc.contributor.oralexamcommitteeZhong-Li Lu;Jing-Qun Huang;Chen-Mou Cheng;Shih-Wei Lien
dc.subject.keywordHalide,MLIR,仿射轉換,隨機優化演算法,zh_TW
dc.subject.keywordHalide,MLIR,Affine transformation,stochastic optimization algorithm,en
dc.relation.page45-
dc.identifier.doi10.6342/NTU202300704-
dc.rights.note同意授權(限校園內公開)-
dc.date.accepted2023-07-27-
dc.contributor.author-college電機資訊學院-
dc.contributor.author-dept資訊網路與多媒體研究所-
dc.date.embargo-lift2025-07-24-
顯示於系所單位:資訊網路與多媒體研究所

文件中的檔案:
檔案 大小格式 
ntu-111-2.pdf
授權僅限NTU校內IP使用(校園外請利用VPN校外連線服務)
1.13 MBAdobe PDF
顯示文件簡單紀錄


系統中的文件,除了特別指名其著作權條款之外,均受到著作權保護,並且保留所有的權利。

社群連結
聯絡資訊
10617臺北市大安區羅斯福路四段1號
No.1 Sec.4, Roosevelt Rd., Taipei, Taiwan, R.O.C. 106
Tel: (02)33662353
Email: ntuetds@ntu.edu.tw
意見箱
相關連結
館藏目錄
國內圖書館整合查詢 MetaCat
臺大學術典藏 NTU Scholars
臺大圖書館數位典藏館
本站聲明
© NTU Library All Rights Reserved