Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85582

Full metadata record

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 劉邦鋒(Pangfeng Liu) | |
| dc.contributor.author | Yu-Jen Chang | en |
| dc.contributor.author | 張佑任 | zh_TW |
| dc.date.accessioned | 2023-03-19T23:19:02Z | - |
| dc.date.copyright | 2022-07-05 | |
| dc.date.issued | 2022 | |
| dc.date.submitted | 2022-07-01 | |
| dc.identifier.citation | [1] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” 2015. [Online]. Available: https://arxiv.org/abs/1512.03385 [2] A. Howard, M. Sandler, G. Chu, L.-C. Chen, B. Chen, M. Tan, W. Wang, Y. Zhu, R. Pang, V. Vasudevan, Q. V. Le, and H. Adam, “Searching for mobilenetv3,” 2019. [Online]. Available: https://arxiv.org/abs/1905.02244 [3] M. Tan and Q. V. Le, “Efficientnet: Rethinking model scaling for convolutional neural networks,” CoRR, vol. abs/1905.11946, 2019. [Online]. Available: http://arxiv.org/abs/1905.11946 [4] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” 2014. [Online]. Available: https://arxiv.org/abs/1409.1556 [5] F. Chollet, “Xception: Deep learning with depthwise separable convolutions,” CoRR, vol. abs/1610.02357, 2016. [Online]. Available: http://arxiv.org/abs/1610.02357 [6] G. Huang, Z. Liu, and K. Q. Weinberger, “Densely connected convolutional networks,” CoRR, vol. abs/1608.06993, 2016. [Online]. Available: http://arxiv.org/abs/1608.06993 [7] O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” in International Conference on Medical image computing and computer-assisted intervention. Springer, 2015, pp. 234–241. [8] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2015, pp. 3431–3440. [9] A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, “Yolov4: Optimal speed and accuracy of object detection,” 2020. [10] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” 2014. [Online]. Available: https://arxiv.org/abs/1409.4842 [11] S. Ren, K. He, R. Girshick, and J. Sun, “Faster r-cnn: Towards real-time object detection with region proposal networks,” 2015. [Online]. Available: https://arxiv.org/abs/1506.01497 [12] K. He, G. Gkioxari, P. Dollár, and R. Girshick, “Mask r-cnn,” 2017. [Online]. Available: https://arxiv.org/abs/1703.06870 [13] J. Elson, J. J. Douceur, J. Howell, and J. Saul, “Asirra: A captcha that exploits interest-aligned manual image categorization,” in Proceedings of 14th ACM Conference on Computer and Communications Security (CCS). Association for Computing Machinery, Inc., October 2007. [Online]. Available: https://www.microsoft.com/en-us/research/publication/asirra-a-captcha-that-exploits-interest-aligned-manual-image-categorization/ [14] L. Bossard, M. Guillaumin, and L. Van Gool, “Food-101 – mining discriminative components with random forests,” in European Conference on Computer Vision, 2014. [15] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “Imagenet: A large-scale hierarchical image database,” in 2009 IEEE conference on computer vision and pattern recognition. IEEE, 2009, pp. 248–255. [16] R. T. Ionescu, B. Alexe, M. Leordeanu, M. Popescu, D. P. Papadopoulos, and V. Ferrari, “How hard can it be? estimating the difficulty of visual search in an image,” CoRR, vol. abs/1705.08280, 2017. [Online]. Available: http://arxiv.org/abs/1705.08280 [17] P. Soviany and R. T. Ionescu, “Optimizing the trade-off between single-stage and two-stage object detectors using image difficulty prediction,” CoRR, vol. abs/1803.08707, 2018. [Online]. Available: http://arxiv.org/abs/1803.08707 [18] S. Vijayanarasimhan and K. Grauman, “What’s it going to cost you?: Predicting effort vs. informativeness for multi-label image annotations,” in 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 2262–2269. [19] D. Liu, Y. Xiong, K. Pulli, and L. Shapiro, “Estimating image segmentation difficulty,” in Proceedings of the 7th International Conference on Machine Learning and Data Mining in Pattern Recognition, ser. MLDM’11. Berlin, Heidelberg: Springer-Verlag, 2011, pp. 484–495. [20] F. Scheidegger, R. Istrate, G. Mariani, L. Benini, C. Bekas, and C. Malossi, “Efficient image dataset classification difficulty estimation for predicting deep-learning accuracy,” 2018. [Online]. Available: https://arxiv.org/abs/1803.09588 [21] Y. Bengio, J. Louradour, R. Collobert, and J. Weston, “Curriculum learning,” in Proceedings of the 26th Annual International Conference on Machine Learning, ser. ICML ’09. New York, NY, USA: Association for Computing Machinery, 2009, pp. 41–48. [Online]. Available: https://doi.org/10.1145/1553374.1553380 [22] P. Soviany, C. Ardei, R. T. Ionescu, and M. Leordeanu, “Image difficulty curriculum for generative adversarial networks (cugan),” CoRR, vol. abs/1910.08967, 2019. [Online]. Available: http://arxiv.org/abs/1910.08967 [23] Y. J. Lee and K. Grauman, “Learning the easy things first: Self-paced visual category discovery,” in CVPR 2011, 2011, pp. 1721–1728. [24] J. Wang, X. Wang, and W. Liu, “Weakly- and semi-supervised faster r-cnn with curriculum learning,” in 2018 24th International Conference on Pattern Recognition (ICPR), 2018, pp. 2416–2421. [25] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, “TensorFlow: Large-scale machine learning on heterogeneous systems,” 2015, software available from tensorflow.org. [Online]. Available: https://www.tensorflow.org/ [26] T. Chen, T. Moreau, Z. Jiang, L. Zheng, E. Yan, M. Cowan, H. Shen, L. Wang, Y. Hu, L. Ceze, C. Guestrin, and A. Krishnamurthy, “Tvm: An automated end-to-end optimizing compiler for deep learning,” 2018. [Online]. Available: https://arxiv.org/abs/1802.04799 [27] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” 2014. [Online]. Available: https://arxiv.org/abs/1412.6980 | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85582 | - |
| dc.description.abstract | 此篇論文提出一個方法,使得我們可以預測一張圖片分類的難易度並根據預測結果縮小一張圖片來加速模型的推論時間。我們觀察到模型像是ResNet-50和EfficientNet仍然可以在一張圖片縮小的情況下正確的分類。我們把這些縮小後可以正確分類的圖片當作簡單的圖片,其餘的當作複雜的圖片。於是我們就蒐集了帶有不同難易度的圖片,並訓練一個「難易度」模型,這個模型可以分類一張圖片的難易度,並根據難易度縮小一張圖片。此外我們使用「推論」模型去分類一張圖片的種類,這些「推論」模型會專注於特定的圖片大小去提高他們的準確度。最後我們連結「難易度」模型和「推論」模型來得到「混合」模型。我們的實驗使用輕型的MobileNetV3-small當作「難易度」的模型,並使用ResNet-50和EfficientNet-B4當作「推論」模型。我們實驗結果指出一個推論時間和圖片分類準確度的平衡,而「難易度」模型的信心水準閾值會影響這個平衡。若「難易度」模型的信心水準閾值越高,推論時間和圖片分類準確度會提升。反之若「難易度」模型的信心水準閾值越低,推論時間和圖片分類準確度會降低。因此,使用者可以調節「難易度」模型的信心水準閾值來影響「混合」模型的行為,進而去尋找在推論時間和分類準確度之間的一個客製化平衡。 | zh_TW |
| dc.description.abstract | This paper introduces a scheme that predicts how difficult an image is to classify, downsizes the image according to the prediction, and thereby reduces inference time. We observe that models such as ResNet-50 and EfficientNet still classify some images correctly after downsizing. We treat the images that remain correctly classified as easy and the rest as complex. We then collect images of different difficulties and train a difficulty model that classifies the difficulty of an image and determines whether it should be downsized. In addition, we use an inference model that consists of multiple models for classifying images of different sizes; each model is trained on a specific dataset to increase its accuracy at a particular image size. Finally, we concatenate the difficulty and inference models to obtain the hybrid model. Our experiments use MobileNetV3-small as the lightweight difficulty model, and ResNet-50 and EfficientNet-B4 as the inference models. Experimental results indicate a trade-off between inference time and image classification accuracy, governed by the confidence threshold of the difficulty model: a higher threshold increases both inference time and classification accuracy, and a lower threshold decreases both. As a result, the user can control the behavior of the hybrid model by adjusting the confidence threshold of the difficulty model to find a customized balance between inference time and classification accuracy. | en |
| dc.description.provenance | Made available in DSpace on 2023-03-19T23:19:02Z (GMT). No. of bitstreams: 1 U0001-2906202214521000.pdf: 2898107 bytes, checksum: 68671a69bbd28fde44b0d656c8961157 (MD5) Previous issue date: 2022 | en |
| dc.description.tableofcontents | Oral Examination Committee Certification i; Acknowledgements iii; Abstract (Chinese) iv; Abstract v; Contents vi; List of Figures vii; List of Tables viii; 1. Introduction 1; 2. Related Work 5; 2.1 Image Difficulty Estimation 5; 2.2 Image Difficulty Applications 5; 2.3 Relation to our Work 6; 3. Scheme 7; 3.1 Resizing Method 7; 3.2 Difficulty Model 8; 3.3 Inference Model 10; 4. Evaluation 11; 4.1 Experiment Settings 11; 4.2 ImageNet 12; 5. Conclusion 19; References 21; A. Appendix 25; A.1 Dogs vs. Cats 25; A.2 Food 101 26 | |
| dc.language.iso | en | |
| dc.subject | 機器學習 | zh_TW |
| dc.subject | 圖片難易度 | zh_TW |
| dc.subject | 卷積神經網路 | zh_TW |
| dc.subject | 電腦視覺 | zh_TW |
| dc.subject | 有效率的推論 | zh_TW |
| dc.subject | Convolutional Neural Networks | en |
| dc.subject | Computer Vision | en |
| dc.subject | Efficient Inference | en |
| dc.subject | Image Difficulty | en |
| dc.subject | Machine Learning | en |
| dc.title | 藉由預測圖片難易度在卷積神經網路上有效率的推論 | zh_TW |
| dc.title | Efficient Inference on Convolutional Neural Networks by Image Difficulty Prediction | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 110-2 | |
| dc.description.degree | Master's | |
| dc.contributor.oralexamcommittee | 吳真貞(Jan-jan Wu),洪鼎詠(Ding-Yong Hong) | |
| dc.subject.keyword | 卷積神經網路,機器學習,圖片難易度,有效率的推論,電腦視覺 | zh_TW |
| dc.subject.keyword | Convolutional Neural Networks, Machine Learning, Image Difficulty, Efficient Inference, Computer Vision | en |
| dc.relation.page | 30 | |
| dc.identifier.doi | 10.6342/NTU202201202 | |
| dc.rights.note | Authorization granted (open access worldwide) | |
| dc.date.accepted | 2022-07-04 | |
| dc.contributor.author-college | 電機資訊學院 | zh_TW |
| dc.contributor.author-dept | 資訊工程學研究所 | zh_TW |
| dc.date.embargo-lift | 2022-07-05 | - |
| Appears in Collections: | Department of Computer Science and Information Engineering | |
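
The routing described in the English abstract above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the names `HybridModel`, `downsize`, `small_model`, and `full_model` are hypothetical, the models are plain callables rather than MobileNetV3-small, ResNet-50, or EfficientNet-B4, only two image sizes are assumed, and the naive subsampling stands in for whatever resizing method the thesis uses.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical stand-in for an image: a 2-D grid of pixel values.
Image = List[List[float]]


def downsize(image: Image, factor: int) -> Image:
    """Keep every `factor`-th pixel in both dimensions (naive subsampling)."""
    return [row[::factor] for row in image[::factor]]


@dataclass
class HybridModel:
    """Difficulty model followed by one of two size-specific classifiers."""

    # Assumed interface: returns the confidence, in [0, 1], that the image is easy.
    difficulty_model: Callable[[Image], float]
    # Assumed classifier specialised for downsized images.
    small_model: Callable[[Image], str]
    # Assumed classifier for full-size images.
    full_model: Callable[[Image], str]
    # Confidence threshold that tunes the speed/accuracy trade-off.
    threshold: float
    factor: int = 2

    def predict(self, image: Image) -> Tuple[str, str]:
        """Return (predicted label, path taken), where path is 'small' or 'full'."""
        p_easy = self.difficulty_model(image)
        if p_easy >= self.threshold:
            # Confidently easy: classify the cheaper, downsized image.
            return self.small_model(downsize(image, self.factor)), "small"
        # Otherwise fall back to the slower full-size path.
        return self.full_model(image), "full"
```

Under these assumptions, raising `threshold` sends fewer images down the downsized path, which matches the abstract's statement that a higher threshold increases both inference time and accuracy.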

Files in this item:

| File | Size | Format | |
|---|---|---|---|
| U0001-2906202214521000.pdf | 2.83 MB | Adobe PDF | View/Open |
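
Unless their copyright terms are specifically stated otherwise, all items in this system are protected by copyright, with all rights reserved.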
