Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92438
Full metadata record
DC Field | Value | Language
---|---|---
dc.contributor.advisor | 莊永裕 | zh_TW |
dc.contributor.advisor | Yung-Yu Chuang | en |
dc.contributor.author | 蔡政諺 | zh_TW |
dc.contributor.author | Cheng-Yen Tsai | en |
dc.date.accessioned | 2024-03-22T16:30:30Z | - |
dc.date.available | 2024-03-23 | - |
dc.date.copyright | 2024-03-22 | - |
dc.date.issued | 2024 | - |
dc.date.submitted | 2024-01-29 | - |
dc.identifier.citation | [1] G. Chen, Y. Liu, J. Wang, J. Peng, Y. Hao, L. Chu, S. Tang, Z. Wu, Z. Chen, Z. Yu, et al. PP-Matting: High-accuracy natural image matting. arXiv preprint arXiv:2204.09433, 2022.
[2] Y.-Y. Chuang, B. Curless, D. H. Salesin, and R. Szeliski. A Bayesian approach to digital matting. In Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001), volume 2, pages II–II. IEEE, 2001.
[3] M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth, and B. Schiele. The Cityscapes dataset for semantic urban scene understanding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3213–3223, 2016.
[4] M. Forte and F. Pitié. F, B, alpha matting. arXiv preprint arXiv:2003.07711, 2020.
[5] K. He and J. Sun. Fast guided filter. arXiv preprint arXiv:1505.00996, 2015.
[6] K. He, J. Sun, and X. Tang. Guided image filtering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(6):1397–1409, 2012.
[7] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
[8] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4700–4708, 2017.
[9] Z. Ke, J. Sun, K. Li, Q. Yan, and R. W. Lau. MODNet: Real-time trimap-free portrait matting via objective decomposition. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, pages 1140–1147, 2022.
[10] A. Levin, D. Lischinski, and Y. Weiss. A closed-form solution to natural image matting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2):228–242, 2007.
[11] J. Li, J. Zhang, S. J. Maybank, and D. Tao. Bridging composite and real: towards end-to-end deep image matting. International Journal of Computer Vision, 130(2):246–266, 2022.
[12] O. Liba, L. Cai, Y.-T. Tsai, E. Eban, Y. Movshovitz-Attias, Y. Pritch, H. Chen, and J. T. Barron. Sky optimization: Semantically aware image processing of skies in low-light photography. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pages 526–527, 2020.
[13] R. Liu, J. Lehman, P. Molino, F. Petroski Such, E. Frank, A. Sergeev, and J. Yosinski. An intriguing failing of convolutional neural networks and the CoordConv solution. Advances in Neural Information Processing Systems, 31, 2018.
[14] Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo. Swin Transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 10012–10022, 2021.
[15] J. Long, E. Shelhamer, and T. Darrell. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3431–3440, 2015.
[16] R. P. Mihail, S. Workman, Z. Bessinger, and N. Jacobs. Sky segmentation in the wild: An empirical study. In 2016 IEEE Winter Conference on Applications of Computer Vision (WACV), pages 1–6. IEEE, 2016.
[17] G. Park, S. Son, J. Yoo, S. Kim, and N. Kwak. MatteFormer: Transformer-based image matting via prior-tokens. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11696–11706, 2022.
[18] C. Rhemann, C. Rother, J. Wang, M. Gelautz, P. Kohli, and P. Rott. A perceptually motivated online benchmark for image matting. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 1826–1833. IEEE, 2009.
[19] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen. MobileNetV2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4510–4520, 2018.
[20] T. Takikawa, D. Acuna, V. Jampani, and S. Fidler. Gated-SCNN: Gated shape CNNs for semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 5229–5238, 2019.
[21] Y.-H. Tsai, X. Shen, Z. Lin, K. Sunkavalli, and M.-H. Yang. Sky is not the limit: Semantic-aware sky replacement. ACM Transactions on Graphics (TOG), 35(4):1–11, 2016.
[22] J. Wang and M. F. Cohen. Optimized color sampling for robust matting. In 2007 IEEE Conference on Computer Vision and Pattern Recognition, pages 1–8. IEEE, 2007.
[23] J. Wang, K. Sun, T. Cheng, B. Jiang, C. Deng, Y. Zhao, D. Liu, Y. Mu, M. Tan, X. Wang, et al. Deep high-resolution representation learning for visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(10):3349–3364, 2020.
[24] X. Wang, Q. Lv, G. Chen, J. Zhang, Z. Wei, J. Dong, H. Fu, Z. Zhu, J. Liu, and X. Jin. MobileSky: Real-time sky replacement for mobile AR. IEEE Transactions on Visualization and Computer Graphics, 2023.
[25] H. Wu, S. Zheng, J. Zhang, and K. Huang. Fast end-to-end trainable guided filter. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1838–1847, 2018.
[26] N. Xu, B. Price, S. Cohen, and T. Huang. Deep image matting. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2970–2979, 2017.
[27] Y. Zhang, L. Gong, L. Fan, P. Ren, Q. Huang, H. Bao, and W. Xu. A late fusion CNN for digital matting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7469–7478, 2019.
[28] H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia. Pyramid scene parsing network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2881–2890, 2017.
[29] B. Zhou, H. Zhao, X. Puig, S. Fidler, A. Barriuso, and A. Torralba. Scene parsing through ADE20K dataset. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 633–641, 2017.
[30] Z. Zou, R. Zhao, T. Shi, S. Qiu, and Z. Shi. Castle in the sky: Dynamic sky replacement and harmonization in videos. IEEE Transactions on Image Processing, 31:5067–5078, 2022. | - |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92438 | - |
dc.description.abstract | 這篇論文提出了一個改善的天空影像去背方法。我們提出了一個高解析度的天空影像去背資料集,透過人工標註三分圖 (trimap) 並導入基於三分圖的影像去背網路,生成精確的天空透明遮罩 (alpha matte)。接著,我們修改了 PP-Matting,一種通用的影像去背網路,使其在天空影像去背上有更好的表現。具體來說,我們將 CoordConv 層和可訓練的導向濾波器整合到網路中,並進行初步研究以找到最佳設計。實驗結果展示,相較於現有的天空影像去背資料集,我們的資料集為各種影像去背網路提供了更好的訓練環境,無論是在量化指標還是視覺品質上。此外,我們修改過的網路架構在天空影像去背上,相較於現有方法具有更優秀的表現。 | zh_TW |
dc.description.abstract | This paper proposes an improved method for sky image matting. We present a high-resolution sky image matting dataset and generate accurate alpha mattes by leveraging a trimap-based image matting network with manually annotated trimaps. Next, we modify the architecture of PP-Matting, a general-purpose image matting network, to better suit the task of sky image matting. Specifically, we integrate CoordConv layers and a trainable guided filter into the network and conduct a preliminary study to find the optimal design. Experimental results demonstrate that our dataset provides a better training environment for various image matting networks compared to an existing sky image matting dataset in both quantitative metrics and visual quality. Additionally, our modified network shows superior performance compared to existing methods. | en |
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-03-22T16:30:30Z No. of bitstreams: 0 | en |
dc.description.provenance | Made available in DSpace on 2024-03-22T16:30:30Z (GMT). No. of bitstreams: 0 | en |
dc.description.tableofcontents | Verification Letter from the Oral Examination Committee i
Acknowledgements ii
摘要 iii
Abstract iv
Contents v
List of Figures vii
List of Tables viii
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Background 1
1.3 Contribution 3
Chapter 2 Related Works 4
2.1 Image Matting 4
2.2 Sky Replacement 6
2.3 Sky Matting Dataset 8
Chapter 3 Proposed Dataset 11
3.1 Image Properties 11
3.2 Ground Truth 13
Chapter 4 Method 14
4.1 PP-Matting 15
4.2 Coordinate Convolutional Layer 16
4.3 Trainable Guided Filter 17
4.4 Loss Function 18
4.5 Preliminary Study 19
Chapter 5 Experimental Results 21
5.1 Implementation Details 21
5.2 Evaluation Metrics 22
5.3 Quantitative Comparison 23
5.4 Visual Comparison 23
Chapter 6 Conclusion 26
References 27 | - |
dc.language.iso | en | - |
dc.title | 使用高解析度資料集於改善天空影像去背方法 | zh_TW |
dc.title | An Improved Sky Image Matting Method with a High-Resolution Dataset | en |
dc.type | Thesis | - |
dc.date.schoolyear | 112-1 | - |
dc.description.degree | 碩士 | - |
dc.contributor.oralexamcommittee | 吳賦哲;葉正聖 | zh_TW |
dc.contributor.oralexamcommittee | Fu-Che Wu;Jeng-Sheng Yeh | en |
dc.subject.keyword | 影像去背, 電腦視覺, 深度學習 | zh_TW |
dc.subject.keyword | Image Matting, Computer Vision, Deep Learning | en |
dc.relation.page | 31 | - |
dc.identifier.doi | 10.6342/NTU202400108 | - |
dc.rights.note | 未授權 | - |
dc.date.accepted | 2024-01-31 | - |
dc.contributor.author-college | 電機資訊學院 | - |
dc.contributor.author-dept | 資訊工程學系 | - |
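The abstract describes integrating CoordConv layers (ref. [13]) into PP-Matting so the network can condition on absolute image position, which is useful for skies since their location in a frame is strongly position-dependent. As a minimal illustrative sketch only (not the thesis code; the function name `add_coord_channels` is hypothetical), a CoordConv layer simply concatenates normalized x/y coordinate channels onto the feature map before an ordinary convolution:

```python
import numpy as np

def add_coord_channels(feat: np.ndarray) -> np.ndarray:
    """Append normalized y/x coordinate channels to a (C, H, W) feature map.

    The two extra channels range from -1 at the top/left edge to +1 at the
    bottom/right edge, letting a following convolution see absolute position.
    """
    c, h, w = feat.shape
    # Row coordinates, constant along each row; shape (H, W).
    ys = np.linspace(-1.0, 1.0, h).reshape(h, 1).repeat(w, axis=1)
    # Column coordinates, constant along each column; shape (H, W).
    xs = np.linspace(-1.0, 1.0, w).reshape(1, w).repeat(h, axis=0)
    # Stack onto the channel axis: output shape (C + 2, H, W).
    return np.concatenate([feat, ys[None], xs[None]], axis=0)

feat = np.zeros((3, 4, 5), dtype=np.float32)  # toy 3-channel feature map
out = add_coord_channels(feat)                # shape (5, 4, 5)
```

In the actual network these augmented features would feed a learned convolution; the sketch only shows the coordinate augmentation step itself.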
Appears in Collections: | 資訊工程學系 |
Files in This Item:
File | Size | Format
---|---|---
ntu-112-1.pdf (Restricted Access) | 42.61 MB | Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.