Efficient Inference on Convolutional Neural Networks by Image Difficulty Prediction

Yu-Jen Chang; 張佑任

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85582

Title:	Efficient Inference on Convolutional Neural Networks by Image Difficulty Prediction (藉由預測圖片難易度在卷積神經網路上有效率的推論)
Author:	Yu-Jen Chang (張佑任)
Advisor:	Pangfeng Liu (劉邦鋒)
Keywords:	Convolutional Neural Networks, Machine Learning, Image Difficulty, Efficient Inference, Computer Vision
Publication Year:	2022
Degree:	Master's
Abstract:	This thesis introduces a scheme that predicts how difficult an image is to classify, downsizes the image according to that prediction, and thereby reduces inference time. We observe that models such as ResNet-50 and EfficientNet can still classify some images correctly after downsizing. We treat the images that remain correctly classified after downsizing as easy images and the rest as complex images. We then collect images of different difficulties and train a difficulty model that classifies the difficulty of an image and determines whether the image should be downsized. In addition, we use an inference model consisting of multiple models for classifying images of different sizes; each is trained on a specific dataset to increase its accuracy at its particular image size. Finally, we concatenate the difficulty and inference models to obtain the hybrid model. Our experiments use MobileNetV3-small as the lightweight difficulty model, and ResNet-50 and EfficientNet-B4 as the inference models. Experimental results indicate a trade-off between inference time and image classification accuracy, governed by the confidence threshold of the difficulty model: a higher threshold increases both the inference time and the classification accuracy, while a lower threshold decreases both. As a result, the user can control the behavior of the hybrid model by adjusting the confidence threshold of the difficulty model to find a customized balance between inference time and classification accuracy.
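The routing described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not state the exact decision rule, so the code assumes the difficulty model returns, for each candidate reduced size, a confidence that the image is still classified correctly at that size, and that the smallest size meeting the threshold is chosen. The names `downsize`, `choose_size`, and `hybrid_predict`, the toy nested-list image, and the stub models are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[float]]  # toy square grayscale image (hypothetical stand-in)


def downsize(image: Image, size: int) -> Image:
    """Nearest-neighbour resize of a square image to size x size."""
    n = len(image)
    return [[image[r * n // size][c * n // size] for c in range(size)]
            for r in range(size)]


def choose_size(confidences: Dict[int, float], full_size: int,
                threshold: float) -> int:
    """Return the smallest size whose 'easy' confidence meets the threshold.

    Assumed rule: fall back to full_size when no smaller size qualifies.
    """
    for size in sorted(confidences):
        if size < full_size and confidences[size] >= threshold:
            return size
    return full_size


def hybrid_predict(image: Image,
                   difficulty_model: Callable[[Image], Dict[int, float]],
                   inference_models: Dict[int, Callable[[Image], int]],
                   full_size: int,
                   threshold: float) -> Tuple[int, int]:
    """Route the image to the inference model for the chosen size.

    Returns (predicted label, image size actually used).
    """
    size = choose_size(difficulty_model(image), full_size, threshold)
    label = inference_models[size](downsize(image, size))
    return label, size


# Stub models standing in for MobileNetV3-small and the per-size classifiers.
def stub_difficulty_model(image: Image) -> Dict[int, float]:
    return {2: 0.8}  # 80% confident the image is "easy" at size 2


def stub_classifier(image: Image) -> int:
    return 0  # always predicts class 0


image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
models = {2: stub_classifier, 4: stub_classifier}

low = hybrid_predict(image, stub_difficulty_model, models, 4, 0.7)   # (0, 2)
high = hybrid_predict(image, stub_difficulty_model, models, 4, 0.9)  # (0, 4)
```

With the lower threshold the image is downsized and handled by the small-size model; with the higher threshold it stays at full size. Under this assumed rule, raising the threshold sends more images down the slower full-size path, which matches the reported trade-off.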
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/85582
DOI:	10.6342/NTU202201202
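Full-text authorization:	Authorized (open access worldwide)
Full-text release date:	2022-07-05
Appears in collections:	Department of Computer Science and Information Engineering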

Files in this item:

File	Size	Format
U0001-2906202214521000.pdf	2.83 MB	Adobe PDF


Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.
