Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/60505

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 貝蘇章(Soo-Chang Pei) | |
| dc.contributor.author | Chien-Chuan Su | en |
| dc.contributor.author | 蘇建銓 | zh_TW |
| dc.date.accessioned | 2021-06-16T10:20:01Z | - |
| dc.date.available | 2025-07-05 | |
| dc.date.copyright | 2020-07-15 | |
| dc.date.issued | 2020 | |
| dc.date.submitted | 2020-07-06 | |
| dc.identifier.citation | [1] Fairchild HDR dataset. http://rit-mcsl.org/fairchild//HDR.html. [2] HDR-EYE. http://mmspg.epfl.ch/hdr-eye. [3] HDRheaven. https://hdrihaven.com/. [4] HDRlab. http://www.hdrlabs.com/sibl/archive.html. [5] High Dynamic Range Specific Image Dataset (HDRSID). http://facultymembers.sbu.ac.ir/moghaddam/index.php/main/page/10. [6] A. Adams. The Print (Ansel Adams Photography), volume 4 of 10. Little, Brown, 3rd edition, June 1995. [7] W. Adams, J. Elder, E. Graf, J. Leyland, A. Lugitgheid, and A. Muryy. The Southampton-York Natural Scenes (SYNS) dataset: Statistics of surface attitude. Scientific Reports, 6, Oct. 2016. [8] M. Arjovsky, S. Chintala, and L. Bottou. Wasserstein GAN, 2017. [9] M. Ashikhmin. A tone mapping algorithm for high contrast images. pages 145–156, Jan. 2002. [10] H. Bahng, S. Yoo, W. Cho, D. K. Park, Z. Wu, X. Ma, and J. Choo. Coloring with words: Guiding image colorization through text-based palette generation, 2018. [11] J. Bao, D. Chen, F. Wen, H. Li, and G. Hua. CVAE-GAN: Fine-grained image generation through asymmetric training. CoRR, [12] D. Berthelot, T. Schumm, and L. Metz. BEGAN: Boundary equilibrium generative adversarial networks, 2017. [13] B. Gu, W. Li, M. Zhu, and M. Wang. Local Edge-Preserving Multiscale Decomposition for High Dynamic Range Image Tone Mapping. IEEE Transactions on Image Processing, 22(1):70–79, Jan. 2013. [14] A. Brock, J. Donahue, and K. Simonyan. Large scale GAN training for high fidelity natural image synthesis. CoRR, abs/1809.11096, 2018. [15] X. Chen, Y. Duan, R. Houthooft, J. Schulman, I. Sutskever, and P. Abbeel. InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets. CoRR, abs/1606.03657, 2016. [16] P. Debevec and J. Malik. Recovering high dynamic range radiance maps from photographs. pages 369–378, 1997. [17] J. Donahue, P. Krähenbühl, and T. Darrell. Adversarial feature learning. CoRR, abs/1605.09782, 2016. [18] F. Drago, K. Myszkowski, T. Annen, and N. Chiba. Adaptive logarithmic mapping for displaying high contrast scenes. In P. Brunet and D. Fellner, editors, Computer Graphics Forum (Proc. EUROGRAPHICS 2003), 22(3):419–426, Sept. 2003. Blackwell. [19] V. Dumoulin, I. Belghazi, B. Poole, A. Lamb, M. Arjovsky, O. Mastropietro, and A. C. Courville. Adversarially learned inference. ArXiv, abs/1606.00704, 2016. [20] F. Durand and J. Dorsey. Fast bilateral filtering for the display of high-dynamic-range images. ACM Transactions on Graphics, 21(3), July 2002. [21] G. Eilertsen, R. K. Mantiuk, and J. Unger. A comparative review of tone-mapping algorithms for high dynamic range video. Comput. Graph. Forum, 36(2):565–592, May 2017. [22] Z. Farbman, R. Fattal, D. Lischinski, and R. Szeliski. Edge-preserving decompositions for multi-scale tone and detail manipulation. ACM Trans. Graph., 27, Aug. 2008. [23] R. Fattal, D. Lischinski, and M. Werman. Gradient domain high dynamic range compression. ACM Transactions on Graphics, 21(3), July 2002. [24] S. Ferradans, M. Bertalmio, E. Provenzi, and V. Caselles. An analysis of visual adaptation and contrast perception for tone mapping. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(10):2002–2012, Oct. 2011. [25] J. A. Ferwerda, S. N. Pattanaik, P. Shirley, and D. P. Greenberg. A model of visual adaptation for realistic image synthesis. In Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '96, pages 249–258, New York, NY, USA, 1996. ACM. [26] M.-A. Gardner, K. Sunkavalli, E. Yumer, X. Shen, E. Gambaretto, C. Gagné, and J.-F. Lalonde. Learning to predict indoor illumination from a single image. ACM Transactions on Graphics, 36, Mar. 2017. [27] X. Glorot and Y. Bengio. Understanding the difficulty of training deep feedforward neural networks. In Y. W. Teh and M. Titterington, editors, Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, volume 9 of Proceedings of Machine Learning Research, pages 249–256, Chia Laguna Resort, Sardinia, Italy, 13–15 May 2010. PMLR. [28] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial nets. In Proceedings of the 27th International Conference on Neural Information Processing Systems – Volume 2, NIPS'14, pages 2672–2680, Cambridge, MA, USA, 2014. MIT Press. [29] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial networks, 2014. [30] I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. Courville. Improved training of Wasserstein GANs, 2017. [31] J. Hateren. Encoding of high dynamic range video with a model of human cones. ACM Trans. Graph., 25:1380–1399, Oct. 2006. [32] M. He, D. Chen, J. Liao, P. V. Sander, and L. Yuan. Deep exemplar-based colorization, 2018. [33] P. Isola, J. Zhu, T. Zhou, and A. A. Efros. Image-to-image translation with conditional adversarial networks. CoRR, abs/1611.07004, 2016. [34] T. Jinno and M. Okuda. Multiple exposure fusion for high dynamic range image acquisition. IEEE Transactions on Image Processing, 21(1):358–365, Jan. 2012. [35] M. H. Kim and J. Kautz. Consistent tone reproduction. In Proc. the Tenth IASTED International Conference on Computer Graphics and Imaging (CGIM 2008), pages 152–159, Innsbruck, Austria, 2008. IASTED/ACTA Press. [36] D. Kingma and J. Ba. Adam: A method for stochastic optimization. International Conference on Learning Representations, Dec. 2014. [37] N. Kodali, J. Abernethy, J. Hays, and Z. Kira. On convergence and stability of GANs, 2017. [38] Z. Liang, J. Xu, D. Zhang, Z. Cao, and L. Zhang. A Hybrid l1-l0 Layer Decomposition Model for Tone Mapping. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4758–4766, Salt Lake City, UT, USA, June 2018. IEEE. [39] J. H. Lim and J. C. Ye. Geometric GAN, 2017. [40] D. Lischinski, Z. Farbman, M. Uyttendaele, and R. Szeliski. Interactive local adjustment of tonal values. ACM Trans. Graph., 25:646–653, July 2006. [41] Z. Mai, H. Mansour, R. Mantiuk, P. Nasiopoulos, R. Ward, and W. Heidrich. Optimizing a tone curve for backward-compatible high dynamic range image and video compression. IEEE Transactions on Image Processing, 20(6):1558–1571, June 2011. [42] R. Mantiuk, S. Daly, and L. Kerofsky. Display adaptive tone mapping. ACM Transactions on Graphics (Proceedings of ACM SIGGRAPH 2008), 27(3), Article 68, Aug. 2008. [43] R. Mantiuk, K. Myszkowski, and H.-P. Seidel. A perceptual framework for contrast processing of high dynamic range images. ACM Trans. Appl. Percept., 3(3):286–308, July 2006. [44] X. Mao, Q. Li, H. Xie, R. Y. K. Lau, Z. Wang, and S. P. Smolley. Least squares generative adversarial networks, 2016. [45] B. Mildenhall, J. T. Barron, J. Chen, D. Sharlet, R. Ng, and R. Carroll. Burst denoising with kernel prediction networks. CoRR, abs/1712.02327, 2017. [46] M. Mirza and S. Osindero. Conditional generative adversarial nets. CoRR, abs/1411.1784, 2014. [47] T. Miyato, T. Kataoka, M. Koyama, and Y. Yoshida. Spectral normalization for generative adversarial networks, 2018. [48] T. Miyato, T. Kataoka, M. Koyama, and Y. Yoshida. Spectral normalization for generative adversarial networks. CoRR, abs/1802.05957, 2018. [49] V. Nagarajan and J. Kolter. Gradient descent GAN optimization is locally stable. June 2017. [50] H. Nemoto, P. Korshunov, P. Hanhart, and T. Ebrahimi. Visual attention in LDR and HDR images. 2015. [51] V. Patel, P. Shah, and S. Raman. A generative adversarial network for tone mapping HDR images. Oct. 2017. [52] S. Pattanaik, J. Tumblin, H. Yee, and D. Greenberg. Time-dependent visual adaptation for fast realistic image display. Proceedings of the ACM SIGGRAPH Conference on Computer Graphics, May 2000. [53] A. Radford, L. Metz, and S. Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks, 2015. [54] A. Rana, P. Singh, G. Valenzise, F. Dufaux, N. Komodakis, and A. Smolic. Deep Tone Mapping Operator for High Dynamic Range Images. IEEE Transactions on Image Processing, pages 1–1, 2019. [55] E. Reinhard and K. Devlin. Dynamic range reduction inspired by photoreceptor physiology. IEEE Transactions on Visualization and Computer Graphics, 11(1):13–24, Jan. 2005. [56] E. Reinhard, M. Stark, P. Shirley, and J. Ferwerda. Photographic tone reproduction for digital images. ACM Transactions on Graphics, 21(3), July 2002. [57] T. Salimans, I. Goodfellow, W. Zaremba, V. Cheung, A. Radford, and X. Chen. Improved techniques for training GANs. June 2016. [58] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition, 2014. [59] T.-H. Sun, C.-H. Lai, S.-K. Wong, and Y.-S. Wang. Adversarial colorization of icons based on structure and color conditions, 2019. [60] J. Tumblin and H. Rushmeier. Tone reproduction for realistic images. IEEE Comput. Graph. Appl., 13(6):42–48, Nov. 1993. [61] J. Tumblin and G. Turk. LCIS: A boundary hierarchy for detail-preserving contrast reduction. In Proceedings of the 26th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '99, pages 83–90, New York, NY, USA, 1999. ACM Press/Addison-Wesley Publishing Co. [62] P. Vitoria, L. Raad, and C. Ballester. ChromaGAN: Adversarial picture colorization with semantic class distribution, 2019. [63] G. Ward. A Contrast-based Scalefactor for Luminance Display. In Graphics Gems IV, pages 415–421. Academic Press Professional, Inc., San Diego, CA, USA, 1994. [64] D. Yang, S. Hong, Y. Jang, T. Zhao, and H. Lee. Diversity-sensitive conditional generative adversarial networks. CoRR, abs/1901.09024, 2019. [65] X. Yang, K. Xu, Y. Song, Q. Zhang, X. Wei, and R. W. H. Lau. Image correction via deep reciprocating HDR transformation. CoRR, abs/1804.04371, 2018. [66] H. Yeganeh and Z. Wang. Objective quality assessment of tone-mapped images. Trans. Img. Proc., 22(2):657–667, Feb. 2013. [67] S. Yoo, H. Bahng, S. Chung, J. Lee, J. Chang, and J. Choo. Coloring with limited data: Few-shot colorization via memory-augmented networks, 2019. [68] R. Zhang, P. Isola, and A. A. Efros. Colorful image colorization, 2016. [69] R. Zhang, J.-Y. Zhu, P. Isola, X. Geng, A. S. Lin, T. Yu, and A. A. Efros. Real-time user-guided image colorization with learned deep priors, 2017. [70] J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks, 2017. [71] J.-Y. Zhu, R. Zhang, D. Pathak, T. Darrell, A. A. Efros, O. Wang, and E. Shechtman. Toward multimodal image-to-image translation. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems 30, pages 465–476. Curran Associates, Inc., 2017. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/60505 | - |
| dc.description.abstract | 色彩映射在高動態範圍成像(High Dynamic Range Imaging)中扮演著一個重要角色,其用於在保留視覺特徵與美觀的特性下壓縮一影像由高動態範圍(High Dynamic Range,HDR)至低動態範圍(Low Dynamic Range,LDR)以便於呈現在顯示器之上。過去雖然有許多優秀的色彩映射演算法,但其往往只能呈現一個特定預先設計的風格,且於不同場景合適的演算法會有所不同;而且,演算法的優缺評價是很主觀的,也會隨著不同的人而變化。因此,本論文提出一使用深度學習(Deep Learning)且基於傳統架構的色彩映射(Tone Mapping)演算法,並使用BicycleGAN的訓練架構,令此演算法具有生成不同風格的特性,使用者僅需更改隱性分類碼(Latent code),即可簡易地取得個人喜好的結果。建立在傳統方法的生成器(Generator)架構幫助我們減少生成對抗網絡(Generative Adversarial Network,GAN)訓練中的不確定性,產生美觀、無瑕疵(artifact)的輸出。最後,我們對此方法進行檢測,並且在主觀與客觀的品質指標得到十分優秀的成績,皆優於目前存在的傳統或深度學習色彩映射演算法。 | zh_TW |
| dc.description.abstract | Tone mapping plays an essential role in high dynamic range (HDR) imaging. It aims to preserve the visual information of HDR images in a medium with a limited dynamic range. Although many works have been proposed to produce tone-mapped results from HDR images, most of them can only perform tone mapping in a single pre-designed style. However, the perceived quality of a tone-mapped image varies from person to person, and the preferred tone-mapping style also differs from application to application. In this thesis, a learning-based multimodal tone-mapping method is proposed, which not only achieves excellent visual quality but also enables easy style adjustment. Built on the framework of an improved cVAE-GAN, the proposed method can provide a variety of expert-level tone-mapping results by manipulating different latent codes. Moreover, among both learning-based and non-learning-based methods, the proposed method is fast and produces minimal artifacts. Its tone-mapped visual quality also outperforms state-of-the-art tone-mapping algorithms quantitatively and qualitatively. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-16T10:20:01Z (GMT). No. of bitstreams: 1 U0001-0507202018014600.pdf: 6235189 bytes, checksum: da97586227b2d1ea36542e3011de0536 (MD5) Previous issue date: 2020 | en |
| dc.description.tableofcontents | 口試委員會審定書 (Committee Certification); 誌謝 (Acknowledgements, Chinese); Acknowledgements; 摘要 (Abstract, Chinese); Abstract; 1 Introduction; 2 Backgrounds; 2.1 Classic Tone Mapping Algorithms; 2.1.1 Bilateral method; 2.1.2 Photographic method; 2.1.3 Gradient compression method; 2.2 Generative Adversarial Network; 2.3 Image-to-Image Translation; 2.3.1 Pix2Pix; 2.3.2 CycleGAN; 2.3.3 Example: Colorization; 2.4 Multimodal Image-to-Image Translation; 2.4.1 BicycleGAN; 2.4.2 DSGAN; 2.5 Related Works; 2.5.1 Conventional Tone Mapping Methods; 2.5.2 Learning-based Tone Mapping Methods; 3 Deep Diverse Tone Mapping; 3.1 Problem formulation; 3.1.1 Bilateral Filtering for Tone Mapping; 3.1.2 Learning-based Bilateral Filtering; 3.2 Framework; 3.2.1 Generative model pipeline; 3.2.2 Multimodal generative model; 3.2.3 Loss Function; 3.2.4 Training Data; 3.2.5 Implementation detail; 3.3 Latent Code Selection Strategy; 3.4 Architecture of Models; 4 Experiment Results; 4.1 Qualitative Comparison; 4.2 Diversity of outputs; 4.3 Quantitative Comparison; 4.4 User Study; 4.5 Running Time; 5 Conclusion; 6 Appendix; 6.1 Network Architecture; 6.2 User study material; Bibliography | |
| dc.language.iso | en | |
| dc.subject | 計算機圖形學 | zh_TW |
| dc.subject | 色彩映射 | zh_TW |
| dc.subject | 生成對抗網路 | zh_TW |
| dc.subject | Generative Adversarial Network | en |
| dc.subject | Computational Photography | en |
| dc.subject | Tone Mapping | en |
| dc.title | 基於深度學習之具多樣性色彩映射演算法 | zh_TW |
| dc.title | Deep Learning-based Diverse Tone Mapping Algorithm | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 108-2 | |
| dc.description.degree | 碩士 (Master) | |
| dc.contributor.oralexamcommittee | 杭學鳴(Hsueh-Ming Hang),丁建鈞(Jian-Jiun Ding),黃文良(Wen-Liang Hwang),鐘國亮(Kuo-Liang Chung) | |
| dc.subject.keyword | 色彩映射,計算機圖形學,生成對抗網路 | zh_TW |
| dc.subject.keyword | Tone Mapping, Computational Photography, Generative Adversarial Network | en |
| dc.relation.page | 51 | |
| dc.identifier.doi | 10.6342/NTU202001322 | |
| dc.rights.note | 有償授權 (paid authorization) | |
| dc.date.accepted | 2020-07-07 | |
| dc.contributor.author-college | 電機資訊學院 (College of Electrical Engineering and Computer Science) | zh_TW |
| dc.contributor.author-dept | 電機工程學研究所 (Graduate Institute of Electrical Engineering) | zh_TW |
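The abstract and table of contents describe a generator built on classical bilateral-filter tone mapping (Sec. 3.1.1, reference [20]): log luminance is split into a smoothed base layer and a detail layer, the base layer's range is compressed, and the two are recombined before color is restored. The sketch below illustrates that decompose-compress-recombine idea only; it is a minimal NumPy version of the Durand-Dorsey formulation, not the thesis's learned operator, and the parameter values (`sigma_s`, `sigma_r`, `contrast`) are arbitrary choices for illustration.

```python
import numpy as np

def bilateral_filter(img, sigma_s=2.0, sigma_r=0.4):
    """Brute-force bilateral filter on a 2D array (fine for small images)."""
    h, w = img.shape
    r = int(2 * sigma_s)  # truncate the spatial kernel at 2 sigma
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - r), min(h, y + r + 1)
            x0, x1 = max(0, x - r), min(w, x + r + 1)
            patch = img[y0:y1, x0:x1]
            yy, xx = np.mgrid[y0:y1, x0:x1]
            # spatial weight: distance in the image plane
            spatial = np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * sigma_s ** 2))
            # range weight: difference in (log-)intensity, preserves edges
            range_w = np.exp(-((patch - img[y, x]) ** 2) / (2 * sigma_r ** 2))
            weights = spatial * range_w
            out[y, x] = (weights * patch).sum() / weights.sum()
    return out

def tone_map(hdr_rgb, contrast=5.0):
    """Map a linear, strictly positive HDR image (H, W, 3) into [0, 1]."""
    # Rec. 709 luminance
    lum = hdr_rgb @ np.array([0.2126, 0.7152, 0.0722])
    log_lum = np.log10(lum + 1e-8)
    base = bilateral_filter(log_lum)       # large-scale layer
    detail = log_lum - base                # edge-aware detail layer
    # compress only the base layer so its range maps to log10(contrast)
    scale = np.log10(contrast) / max(base.max() - base.min(), 1e-8)
    new_log_lum = base * scale + detail
    new_lum = 10.0 ** (new_log_lum - base.max() * scale)  # anchor white point
    # restore color by rescaling each channel with the new luminance
    ldr = hdr_rgb / lum[..., None] * new_lum[..., None]
    return np.clip(ldr, 0.0, 1.0)
```

Because only the base layer is compressed while the detail layer passes through unchanged, local contrast survives the dynamic-range reduction; this is the property the thesis's learning-based bilateral formulation (Sec. 3.1.2) builds on.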
| Appears in Collections: | 電機工程學系 (Department of Electrical Engineering) | |
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| U0001-0507202018014600.pdf (restricted access) | 6.09 MB | Adobe PDF | |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated in their license terms.
