A Convolutional Neural Network for Dynamic Shape Inference (動態形狀卷積類神經網路)

邱聖約; Sheng-Yueh Chiu

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/91568

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	劉邦鋒	zh_TW
dc.contributor.advisor	Pangfeng Liu	en
dc.contributor.author	邱聖約	zh_TW
dc.contributor.author	Sheng-Yueh Chiu	en
dc.date.accessioned	2024-01-28T16:34:11Z	-
dc.date.available	2024-02-24	-
dc.date.copyright	2024-01-28	-
dc.date.issued	2023	-
dc.date.submitted	2023-08-09	-
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/91568	-
dc.description.abstract	儘管卷積神經網絡 (CNN) 架構和機器學習框架取得了重大進展，但大多數深度學習框架在進行計算機視覺任務時，只能定義一個 CNN 模型來處理相同形狀的圖像批次，這在很大程度上限制了 AI 應用的設計。在執行訓練或推理階段，圖像會在進入卷積層之前被轉換成特定的大小，作為批次輸入。雖然批次推理由於高 GPU 利用率而具有良好的吞吐量，但在輸入圖像高而瘦的情況下，可能會導致低準確度。我們觀察到一些應用需要進行批次推理，而不需調整圖像的大小，例如 ResizeNet 模型和物體檢測模型。在本文中，我們提出了一種深度學習框架，該框架可以定義一個 CNN 模型，同時處理批次中形狀不同的圖像。與最先進的深度學習框架不同，我們實現的 CNN 模型中的神經網絡層不需要固定維度的四維輸入。我們通過將 ResizeNet 模型的推理部分替換為我們實現的模型，實現了最高達4.35倍的加速，同時僅略微降低了準確度。	zh_TW
dc.description.abstract	Despite advances in convolutional neural network (CNN) architectures and machine learning frameworks, most deep learning frameworks can only define a CNN model that processes batches of images with the same shape in computer vision tasks, which significantly limits the design of AI applications. In both the training and inference phases, images are transformed to a specific size before being fed into the convolutional layers as a batch. Although batch inference can achieve good throughput due to high GPU utilization, it can lead to low accuracy when the input images are tall and skinny. We observe that some applications, such as the ResizeNet model and object detection models, need batch inference without resizing the images. In this paper, we present a deep learning framework with which a CNN model can be defined to process a batch of images with different shapes at the same time. Unlike state-of-the-art deep learning frameworks, the neural network layers in a CNN model defined with our implementation do not require the 4-dimensional input to have fixed dimensions. We modify the ResizeNet model by replacing its inference part with a model we implemented and achieve a speedup of up to 4.35x with only a slight loss of accuracy.	en
dc.description.provenance	Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-01-28T16:34:11Z No. of bitstreams: 0	en
dc.description.provenance	Made available in DSpace on 2024-01-28T16:34:11Z (GMT). No. of bitstreams: 0	en
dc.description.tableofcontents	Oral Examination Committee Certification; Acknowledgements; Abstract (Chinese); Abstract; Contents; List of Figures; List of Tables; Chapter 1 Introduction; Chapter 2 Related Work; Chapter 3 Method; Chapter 4 Evaluation; 4.1 Experiment Settings; 4.2 Performance of modified ResizeNet; 4.3 Performance of ResNet50 built with our framework; Chapter 5 Conclusion; References	-
dc.language.iso	en	-
dc.subject	卷積神經網路	zh_TW
dc.subject	深度學習	zh_TW
dc.subject	動態形狀推理	zh_TW
dc.subject	圖形處理器	zh_TW
dc.subject	優化	zh_TW
dc.subject	Optimization	en
dc.subject	Graphics Processing Unit	en
dc.subject	Dynamic Shape Inference	en
dc.subject	Convolutional Neural Network	en
dc.subject	Deep Learning	en
dc.title	動態形狀卷積類神經網路	zh_TW
dc.title	A Convolutional Neural Network for Dynamic Shape Inference	en
dc.type	Thesis	-
dc.date.schoolyear	111-2	-
dc.description.degree	Master's	-
dc.contributor.oralexamcommittee	洪鼎詠;吳真貞	zh_TW
dc.contributor.oralexamcommittee	Ding-Yong Hong;Jan-Jan Wu	en
dc.subject.keyword	深度學習,卷積神經網路,優化,圖形處理器,動態形狀推理	zh_TW
dc.subject.keyword	Deep Learning,Convolutional Neural Network,Optimization,Graphics Processing Unit,Dynamic Shape Inference	en
dc.relation.page	24	-
dc.identifier.doi	10.6342/NTU202303946	-
dc.rights.note	Not authorized	-
dc.date.accepted	2023-08-11	-
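dc.contributor.author-college	College of Electrical Engineering and Computer Science	-
dc.contributor.author-dept	Department of Computer Science and Information Engineering	-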
Appears in Collections:	Department of Computer Science and Information Engineering

Files in this item:

File	Size	Format
ntu-111-2.pdf (not authorized for public access)	5.98 MB	Adobe PDF


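Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.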