Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85582
Title: Efficient Inference on Convolutional Neural Networks by Image Difficulty Prediction (藉由預測圖片難易度在卷積神經網路上有效率的推論)
Authors: Yu-Jen Chang (張佑任)
Advisor: Pangfeng Liu (劉邦鋒)
Keywords: Convolutional Neural Networks, Machine Learning, Image Difficulty, Efficient Inference, Computer Vision
Publication Year: 2022
Degree: Master's
Abstract: This paper introduces a scheme that predicts how difficult an image is to classify, downsizes the image according to that prediction, and thereby reduces inference time. We observe that models such as ResNet-50 and EfficientNet still classify some images correctly after the images are downsized. We treat images that are classified correctly after downsizing as easy images and all others as complex images. We then collect images of different difficulties and train a difficulty model that classifies the difficulty of an image and determines whether the image should be downsized. In addition, we use an inference model that consists of multiple models, one for each image size; each of these models is trained on a size-specific dataset to increase its accuracy at that size. Finally, we chain the difficulty model and the inference model to obtain the hybrid model. Our experiments use MobileNetV3-small as the lightweight difficulty model, and ResNet-50 and EfficientNet-B4 as the inference models. The experimental results show a trade-off between inference time and image classification accuracy, and the confidence threshold of the difficulty model governs this trade-off: a higher threshold increases both inference time and classification accuracy, and a lower threshold decreases both. The user can therefore adjust the confidence threshold of the difficulty model to control the behavior of the hybrid model and find a customized balance between inference time and classification accuracy.
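The routing idea in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. It assumes only two image sizes, a difficulty model that returns the confidence that an image is easy, and nearest-neighbour subsampling for downsizing. The names `downsize`, `hybrid_predict`, and the stub models are hypothetical and stand in for networks such as MobileNetV3-small and ResNet-50.

```python
from typing import Callable, List, Tuple

# Toy grayscale image as a 2-D list of pixel values (an assumption for this sketch).
Image = List[List[float]]


def downsize(image: Image, factor: int) -> Image:
    """Shrink an image by keeping every `factor`-th row and column."""
    return [row[::factor] for row in image[::factor]]


def hybrid_predict(
    image: Image,
    difficulty_model: Callable[[Image], float],
    small_model: Callable[[Image], int],
    full_model: Callable[[Image], int],
    threshold: float,
    factor: int = 2,
) -> Tuple[int, str]:
    """Route an image to a small-size or full-size classifier.

    `difficulty_model` is assumed to return the confidence (0..1) that the
    image is easy. Only images whose confidence reaches `threshold` are
    downsized, so a higher threshold sends more images down the slower,
    more accurate full-size path.
    """
    p_easy = difficulty_model(image)
    if p_easy >= threshold:
        return small_model(downsize(image, factor)), "small"
    return full_model(image), "full"


# Stub models for illustration only; real models would be trained networks.
def stub_difficulty(image: Image) -> float:
    return 0.8  # pretend the image is judged easy with confidence 0.8


def stub_classifier(image: Image) -> int:
    return 7  # pretend every image belongs to class 7


image = [[0.0] * 4 for _ in range(4)]
print(hybrid_predict(image, stub_difficulty, stub_classifier, stub_classifier, 0.7))
# prints (7, 'small')
print(hybrid_predict(image, stub_difficulty, stub_classifier, stub_classifier, 0.9))
# prints (7, 'full')
```

With the same difficulty confidence of 0.8, raising the threshold from 0.7 to 0.9 moves the image from the downsized path to the full-size path, which mirrors the time and accuracy trade-off described in the abstract.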
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85582
DOI: 10.6342/NTU202201202
Fulltext Rights: Authorized (open access worldwide)
Embargo Lift Date: 2022-07-05
Appears in Collections: Department of Computer Science and Information Engineering

Files in This Item:
File                          Size     Format
U0001-2906202214521000.pdf    2.83 MB  Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
