Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93412

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 林守德 | zh_TW |
| dc.contributor.advisor | Shou-De Lin | en |
| dc.contributor.author | 潘建琿 | zh_TW |
| dc.contributor.author | Felix Liawi | en |
| dc.date.accessioned | 2024-07-31T16:12:02Z | - |
| dc.date.available | 2024-08-01 | - |
| dc.date.copyright | 2024-07-31 | - |
| dc.date.issued | 2024 | - |
| dc.date.submitted | 2024-07-26 | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93412 | - |
| dc.description.abstract | 本文介紹了一個名為 QAPBG (Quality Assessment for Product Background Generation) 的評估框架,專門用於評估產品背景生成結果的質量。雖然這個領域在工 業界有廣泛應用,但卻缺乏結構化的評估方法。此框架提供了一個寶貴的工具, 用於全面評估產品背景生成領域中的生成模型,其特點是包含六個不同的指標: 背景提示對齊、產品一致性、產品孔洞與背景一致性、合理的地面產品、自然陰 影和美學質量。每個指標對於提高生成圖像的真實感和效果至關重要,既考慮了 視覺吸引力又考慮了功能一致性。我們提出並嚴格評估了評分這些指標的方法, 從而建立了一種標準化的方法來評估和改進產品背景生成。 | zh_TW |
| dc.description.abstract | This paper introduces an evaluation framework, named QAPBG (Quality Assessment for Product Background Generation), tailored specifically for assessing the quality of outcomes in product background generation—a domain with widespread industry applications yet lacking structured evaluation methods. This framework serves as a valuable tool for the comprehensive evaluation of generative models in the product background generation domain, featuring six distinct metrics: background prompt alignment, product consistency, product hole and background consistency, reasonable grounding product, natural shadow, and aesthetic quality. Each metric is crucial for enhancing the realism and effectiveness of generated images, addressing both visual appeal and functional coherence. We propose and rigorously evaluate methods for scoring these metrics, thereby establishing a standardized approach to assess and improve product background generation. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-07-31T16:12:02Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2024-07-31T16:12:02Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Oral Examination Committee Certification (i); Acknowledgements (ii); Chinese Abstract (iv); Abstract (v); Contents (vi); List of Figures (viii); List of Tables (xi); Chapter 1 Introduction (1); Chapter 2 Product Background Generation (10); Chapter 3 Related Work (14); Chapter 4 Quality Assessment (16); 4.1 Background and Prompt Alignment (17); 4.2 Product Consistency (19); 4.3 Product Hole Background Consistency (23); 4.4 Reasonable Grounding Product (26); 4.5 Natural Shadow (27); 4.6 Aesthetic (28); Chapter 5 Experiments (31); 5.1 Background and Prompt Alignment (31); 5.2 Product Consistency (33); 5.3 Product Hole Background Consistency (35); 5.4 Reasonable Grounding Product (36); 5.5 Natural Shadow (37); 5.6 Aesthetic (38); Chapter 6 Limitations (50); Chapter 7 Conclusion (51); References (52) | - |
| dc.language.iso | en | - |
| dc.subject | 人工智能生成內容 (Artificial Intelligence Generated Content) | zh_TW |
| dc.subject | 產品背景生成 | zh_TW |
| dc.subject | 品質評估框架 | zh_TW |
| dc.subject | 生成模型評估 | zh_TW |
| dc.subject | 圖像質量評估 | zh_TW |
| dc.subject | Product Background Generation | en |
| dc.subject | AIGC | en |
| dc.subject | Image Quality Assessment | en |
| dc.subject | Generative Models Evaluation | en |
| dc.subject | Quality Assessment Framework | en |
| dc.title | 產品背景生成的質量評估 | zh_TW |
| dc.title | Quality Assessment for Product Background Generation | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 112-2 | - |
| dc.description.degree | Master | - |
| dc.contributor.oralexamcommittee | 廖弘源;莊永裕;陳祝嵩 | zh_TW |
| dc.contributor.oralexamcommittee | Hong-yuan Mark Liao;Yung-Yu Chuang;Chu-Song Chen | en |
| dc.subject.keyword | 人工智能生成內容 (Artificial Intelligence Generated Content), 產品背景生成, 品質評估框架, 生成模型評估, 圖像質量評估 | zh_TW |
| dc.subject.keyword | AIGC, Product Background Generation, Quality Assessment Framework, Generative Models Evaluation, Image Quality Assessment | en |
| dc.relation.page | 58 | - |
| dc.identifier.doi | 10.6342/NTU202402357 | - |
| dc.rights.note | Not authorized | - |
| dc.date.accepted | 2024-07-29 | - |
| dc.contributor.author-college | 電機資訊學院 | - |
| dc.contributor.author-dept | 資訊工程學系 | - |
Appears in Collections: 資訊工程學系
Files in this item:
| File | Size | Format |
|---|---|---|
| ntu-112-2.pdf (not authorized for public access) | 17.52 MB | Adobe PDF |