Assessing Visual Quality: A Comprehensive Model for Evaluating Bokeh Effects and Overall Quality in Photography

簡哲元; Che-Yuan Chien

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92768

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	廖世偉	zh_TW
dc.contributor.advisor	Shih-Wei Liao	en
dc.contributor.author	簡哲元	zh_TW
dc.contributor.author	Che-Yuan Chien	en
dc.date.accessioned	2024-06-21T16:10:44Z	-
dc.date.available	2024-06-22	-
dc.date.copyright	2024-06-21	-
dc.date.issued	2024	-
dc.date.submitted	2024-06-19	-
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92768	-
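dc.description.abstract	With the rapid development of multimedia and internet technology, image quality has significantly improved the user experience. This thesis focuses on quality assessment of bokeh images; bokeh is a popular photographic technique that highlights the main subject by creating a blurred background. Although the bokeh effect is widely used by amateur and professional photographers alike, research on its quality remains limited. To fill this gap, we developed a dedicated dataset to systematically evaluate and compare images with bokeh effects and their overall quality. We introduce the Bokeh Sight Fusion Net (BSFN), a model for assessing image aesthetics that achieves notable results on established datasets. Our work includes the creation of the Bokeh Image Scoring Dataset (BISD), which contains ratings from experienced photographers and general users, and the development of the BSFN model. On the BAID dataset the model reaches an accuracy of 77.89%, with a Spearman rank correlation coefficient (SRCC) of 0.475 and a Pearson correlation coefficient (PCC) of 0.533, the highest accuracy reported so far for this type of assessment. We hope this research lays a foundation for further exploration of bokeh and aesthetic assessment and contributes to improving image quality assessment.	zh_TW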
dc.description.abstract	With the rapid advancement of multimedia and internet technology, the quality of images significantly enhances the user experience. This paper focuses on the quality assessment of bokeh images, a popular photography technique used to highlight main subjects by creating a blurred background. Despite its widespread use among both amateur and professional photographers, research on the quality of bokeh images remains limited. To address this gap, we developed a specialized dataset to systematically evaluate and compare the overall quality of images with bokeh effects. We introduce the Bokeh Sight Fusion Net (BSFN), a model designed to assess image aesthetics, which has achieved significant results on established datasets. Our research includes the creation of a Bokeh Image Scoring Dataset (BISD), enriched with ratings from experienced photographers and general users, and the development of the BSFN model. This model has demonstrated an accuracy rate of 77.89% on the BAID dataset, with Spearman’s rank correlation coefficient (SRCC) and Pearson’s correlation coefficient (PCC) of 0.475 and 0.533, respectively, marking the highest accuracy to date for such assessments. We hope this research will provide a foundation for further exploration of bokeh and aesthetic assessments, and contribute to the enhancement of image quality evaluation.	en
dc.description.provenance	Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-06-21T16:10:44Z No. of bitstreams: 0	en
dc.description.provenance	Made available in DSpace on 2024-06-21T16:10:44Z (GMT). No. of bitstreams: 0	en
dc.description.tableofcontents	Acknowledgements; Abstract (Chinese); Abstract; Contents; List of Figures; List of Tables; Chapter 1 Introduction; Chapter 2 Related Work; 2.1 Datasets; 2.2 Models; Chapter 3 Methodology; 3.1 Bokeh Image Scoring Dataset; 3.1.1 Creating the dataset; 3.1.2 Scoring the dataset; 3.2 Bokeh Sight Fusion Net; Chapter 4 Evaluation; 4.1 Setup; 4.2 Performance evaluation and discussion; 4.3 Ablation Study; Chapter 5 Conclusion; References	-
dc.language.iso	en	-
dc.title	視覺影像評鑑：用於評價攝影中的散景和整體影像質量的模型	zh_TW
dc.title	Assessing Visual Quality: A Comprehensive Model for Evaluating Bokeh Effects and Overall Quality in Photography	en
dc.type	Thesis	-
dc.date.schoolyear	112-2	-
dc.description.degree	Master's	-
dc.contributor.oralexamcommittee	傅楸善;盧瑞山	zh_TW
dc.contributor.oralexamcommittee	Chiou-Shann Fuh;Ruei-Shan Lu	en
dc.subject.keyword	Visual Image Assessment,Visual Aesthetic Assessment,Bokeh Effects	zh_TW
dc.subject.keyword	Image Quality Assessment,Image Aesthetic Assessment,Bokeh Effects	en
dc.relation.page	32	-
dc.identifier.doi	10.6342/NTU202401197	-
dc.rights.note	Not authorized	-
dc.date.accepted	2024-06-19	-
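dc.contributor.author-college	College of Electrical Engineering and Computer Science	-
dc.contributor.author-dept	Department of Computer Science and Information Engineering	-
Appears in Collections:	Department of Computer Science and Information Engineering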

Files in This Item:

File	Size	Format
ntu-112-2.pdf (not currently authorized for public access)	29.63 MB	Adobe PDF

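Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.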