Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21323
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 莊永裕(Yung-Yu Chuang) | |
dc.contributor.author | Hsin-Yu Chang | en |
dc.contributor.author | 張心愉 | zh_TW |
dc.date.accessioned | 2021-06-08T03:31:06Z | - |
dc.date.copyright | 2019-08-19 | |
dc.date.issued | 2019 | |
dc.date.submitted | 2019-08-13 | |
dc.identifier.citation | [1] Y. Alharbi, N. Smith, and P. Wonka. Latent filter scaling for multimodal unsupervised image-to-image translation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1458–1466, 2019.
[2] A. Brock, J. Donahue, and K. Simonyan. Large scale GAN training for high fidelity natural image synthesis. arXiv preprint arXiv:1809.11096, 2018.
[3] C. Chen, Q. Chen, J. Xu, and V. Koltun. Learning to see in the dark. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3291–3300, 2018.
[4] Y.-S. Chen, Y.-C. Wang, M.-H. Kao, and Y.-Y. Chuang. Deep photo enhancer: Unpaired learning for image enhancement from photographs with GANs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 6306–6314, 2018.
[5] W. Cho, S. Choi, D. Keetae Park, I. Shin, and J. Choo. Image-to-image translation via group-wise deep whitening-and-coloring transformation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 10639–10647, 2019.
[6] Y. Choi, M. Choi, M. Kim, J.-W. Ha, S. Kim, and J. Choo. StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 8789–8797, 2018.
[7] L. A. Gatys, A. S. Ecker, and M. Bethge. Image style transfer using convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2414–2423, 2016.
[8] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2672–2680, 2014.
[9] K. He, X. Zhang, S. Ren, and J. Sun. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In Proceedings of the IEEE International Conference on Computer Vision, pages 1026–1034, 2015.
[10] M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, and S. Hochreiter. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In Advances in Neural Information Processing Systems, pages 6626–6637, 2017.
[11] X. Hu, H. Mu, X. Zhang, Z. Wang, J. Sun, and T. Tan. Meta-SR: A magnification-arbitrary network for super-resolution. arXiv preprint arXiv:1903.00875, 2019.
[12] X. Huang and S. Belongie. Arbitrary style transfer in real-time with adaptive instance normalization. In Proceedings of the IEEE International Conference on Computer Vision, pages 1501–1510, 2017.
[13] X. Huang, M.-Y. Liu, S. Belongie, and J. Kautz. Multimodal unsupervised image-to-image translation. In Proceedings of the European Conference on Computer Vision (ECCV), pages 172–189, 2018.
[14] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1125–1134, 2017.
[15] J. Johnson, A. Alahi, and L. Fei-Fei. Perceptual losses for real-time style transfer and super-resolution. In European Conference on Computer Vision, pages 694–711. Springer, 2016.
[16] T. Karras, T. Aila, S. Laine, and J. Lehtinen. Progressive growing of GANs for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196, 2017.
[17] T. Karras, S. Laine, and T. Aila. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4401–4410, 2019.
[18] D. P. Kingma and M. Welling. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114, 2013.
[19] H.-Y. Lee, H.-Y. Tseng, J.-B. Huang, M. Singh, and M.-H. Yang. Diverse image-to-image translation via disentangled representations. In Proceedings of the European Conference on Computer Vision (ECCV), pages 35–51, 2018.
[20] X. Li, J. Hu, S. Zhang, X. Hong, Q. Ye, C. Wu, and R. Ji. Attribute guided unpaired image-to-image translation with semi-supervised learning. arXiv preprint arXiv:1904.12428, 2019.
[21] X. Li, S. Liu, J. Kautz, and M.-H. Yang. Learning linear transformations for fast arbitrary style transfer. arXiv preprint arXiv:1808.04537, 2018.
[22] Y. Li, C. Fang, J. Yang, Z. Wang, X. Lu, and M.-H. Yang. Universal style transfer via feature transforms. In Advances in Neural Information Processing Systems, pages 386–396, 2017.
[23] M.-Y. Liu, T. Breuel, and J. Kautz. Unsupervised image-to-image translation networks. In Advances in Neural Information Processing Systems, pages 700–708, 2017.
[24] Q. Mao, H.-Y. Lee, H.-Y. Tseng, S. Ma, and M.-H. Yang. Mode seeking generative adversarial networks for diverse image synthesis. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1429–1437, 2019.
[25] X. Mao, Q. Li, H. Xie, R. Y. Lau, Z. Wang, and S. Paul Smolley. Least squares generative adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, pages 2794–2802, 2017.
[26] R. Mechrez, I. Talmi, and L. Zelnik-Manor. The contextual loss for image transformation with non-aligned data. In Proceedings of the European Conference on Computer Vision (ECCV), pages 768–783, 2018.
[27] Y. A. Mejjati, C. Richardt, J. Tompkin, D. Cosker, and K. I. Kim. Unsupervised attention-guided image-to-image translation. In Advances in Neural Information Processing Systems, pages 3693–3703, 2018.
[28] T. Miyato, T. Kataoka, M. Koyama, and Y. Yoshida. Spectral normalization for generative adversarial networks. arXiv preprint arXiv:1802.05957, 2018.
[29] S. Mo, M. Cho, and J. Shin. InstaGAN: Instance-aware image-to-image translation. arXiv preprint arXiv:1812.10889, 2018.
[30] A. Pumarola, A. Agudo, A. M. Martinez, A. Sanfeliu, and F. Moreno-Noguer. GANimation: Anatomically-aware facial animation from a single image. In Proceedings of the European Conference on Computer Vision (ECCV), pages 818–833, 2018.
[31] S. Reed, Z. Akata, X. Yan, L. Logeswaran, B. Schiele, and H. Lee. Generative adversarial text to image synthesis. In ICML, 2016.
[32] X. Wang, K. Yu, S. Wu, J. Gu, Y. Liu, C. Dong, Y. Qiao, and C. Change Loy. ESRGAN: Enhanced super-resolution generative adversarial networks. In Proceedings of the European Conference on Computer Vision (ECCV), 2018.
[33] W. Wu, K. Cao, C. Li, C. Qian, and C. C. Loy. TransGaGa: Geometry-aware unsupervised image-to-image translation. arXiv preprint arXiv:1904.09571, 2019.
[34] T. Xiao, J. Hong, and J. Ma. ELEGANT: Exchanging latent encodings with GAN for transferring multiple face attributes. In Proceedings of the European Conference on Computer Vision (ECCV), pages 168–184, 2018.
[35] Z. Yi, H. Zhang, P. Tan, and M. Gong. DualGAN: Unsupervised dual learning for image-to-image translation. In Proceedings of the IEEE International Conference on Computer Vision, pages 2849–2857, 2017.
[36] Y. Zhang, Y. Tian, Y. Kong, B. Zhong, and Y. Fu. Residual dense network for image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2472–2481, 2018.
[37] Z. Zheng, Z. Yu, H. Zheng, Y. Wu, B. Zheng, and P. Lin. Generative adversarial network with multi-branch discriminator for cross-species image-to-image translation. arXiv preprint arXiv:1901.10895, 2019.
[38] J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, pages 2223–2232, 2017.
[39] J.-Y. Zhu, R. Zhang, D. Pathak, T. Darrell, A. A. Efros, O. Wang, and E. Shechtman. Toward multimodal image-to-image translation. In Advances in Neural Information Processing Systems, pages 465–476, 2017. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21323 | - |
dc.description.abstract | 在不同領域間的非監督式圖片轉換在電腦視覺中是個很重要的問題。這問題最大的挑戰是要保留原圖的內容(content preserving)同時又要能做到風格轉換(style translation)。我們定義了這兩個問題的矛盾點(trade-off)並且定義了我們所要保留的內容(content)是什麼。現有的方法都會受到這個矛盾點(trade-off)的限制。舉例來說,在貓轉狗的問題上,主要的問題是貓跟狗的領域間差異很大,所以現有的方法就會無法生出一個合理的結果。我們提出了一個模型去保留屬於各自領域的內容(domain-specific content feature)來解決這個困難的問題,並且證明了我們提出的方法比起其他現有表現最佳的方法有更優異的表現。 | zh_TW |
dc.description.abstract | Unsupervised image-to-image translation across domains is an important problem in computer vision. Its most challenging aspect is preserving content information while performing style translation. We define the content information that must be preserved in cross-domain translation and explore the trade-off between content preservation and style transfer. Current methods suffer from this trade-off and fail to strike a balance between content and style. In the cat-to-dog translation task, for instance, the gap between the two domains is so large that existing methods fail to generate reasonable results. We propose a content-preserving model that learns a domain-specific content feature to control this trade-off, and we demonstrate that the proposed method achieves superior performance to state-of-the-art methods. | en
dc.description.provenance | Made available in DSpace on 2021-06-08T03:31:06Z (GMT). No. of bitstreams: 1 ntu-108-R06922062-1.pdf: 4478690 bytes, checksum: ff43a6ce4afd5fad651ca45a3848e1ec (MD5) Previous issue date: 2019 | en |
dc.description.tableofcontents | 誌謝 (Acknowledgements)
摘要 (Chinese Abstract)
Abstract
1 Introduction
2 Related Work
 2.1 Generative Adversarial Networks
 2.2 Image-to-Image Translation
 2.3 Style Transfer
3 Proposed Method
 3.1 Model Overview
 3.2 Domain-specific Content Feature
 3.3 Example-guided Translation
 3.4 Other Loss Functions
4 Implementation
 4.1 Encoder
  4.1.1 Content encoder
  4.1.2 Style encoder
  4.1.3 Mapping Layer
 4.2 Generator
 4.3 Training Details
5 Experiments
 5.1 Datasets
 5.2 Compared Methods
 5.3 Evaluation Metrics
 5.4 Comparisons with State-of-the-Arts
  5.4.1 Qualitative comparison
  5.4.2 Quantitative comparison
 5.5 Representation Disentanglement
  5.5.1 Interpolation in latent space
  5.5.2 Example-guided generation
6 Conclusion
Bibliography | |
dc.language.iso | en | |
dc.title | 基於跨領域映射達成非監督式內容保留風格導向的圖片轉換 | zh_TW |
dc.title | Unsupervised Content-Preserving Example-Guided Image-to-Image Translation via Domain Mapping | en |
dc.type | Thesis | |
dc.date.schoolyear | 107-2 | |
dc.description.degree | 碩士 (Master) | |
dc.contributor.oralexamcommittee | 林彥宇(Yen-Yu Lin),王鈺強(Yu-Chiang Wang),陳駿丞(Jun-Cheng Chen) | |
dc.subject.keyword | 圖片轉換, 內容保留, 風格導向 | zh_TW |
dc.subject.keyword | Image-to-Image translation, Content-preserving, Example-guided translation | en |
dc.relation.page | 33 | |
dc.identifier.doi | 10.6342/NTU201903078 | |
dc.rights.note | 未授權 (Not authorized) | |
dc.date.accepted | 2019-08-13 | |
dc.contributor.author-college | 電機資訊學院 (College of Electrical Engineering and Computer Science) | zh_TW |
dc.contributor.author-dept | 資訊工程學研究所 (Graduate Institute of Computer Science and Information Engineering) | zh_TW |
Appears in Collections: | Department of Computer Science and Information Engineering
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-108-1.pdf (currently not authorized for public access) | 4.37 MB | Adobe PDF |
All items in the system are protected by copyright, with all rights reserved, unless otherwise indicated.
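The abstract above describes the method only at a high level: a model that keeps a domain-specific content feature from the source image while taking style from an exemplar in the target domain. As a rough illustration of that idea only, and not the thesis's actual implementation (the PDF is restricted, so every module name, layer size, and wiring choice below is an assumption), a minimal PyTorch sketch of an example-guided translation step might look like this:

```python
# Hypothetical sketch of example-guided translation as described in the
# abstract: a content encoder keeps a spatial content feature from the
# source image, a style encoder compresses an exemplar into a small global
# style code, and a generator combines the two. All names and shapes here
# are illustrative assumptions, not the thesis architecture.
import torch
import torch.nn as nn

class ContentEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 7, stride=1, padding=3), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, 256, 4, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)  # spatial content feature map

class StyleEncoder(nn.Module):
    def __init__(self, style_dim=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(128, style_dim)

    def forward(self, x):
        return self.fc(self.net(x).flatten(1))  # global style code

class Generator(nn.Module):
    """Decodes a content map modulated by a style code (AdaIN-like)."""
    def __init__(self, style_dim=8):
        super().__init__()
        # Map the style code to per-channel scale and shift parameters.
        self.affine = nn.Linear(style_dim, 256 * 2)
        self.net = nn.Sequential(
            nn.Upsample(scale_factor=2), nn.Conv2d(256, 128, 5, padding=2), nn.ReLU(),
            nn.Upsample(scale_factor=2), nn.Conv2d(128, 64, 5, padding=2), nn.ReLU(),
            nn.Conv2d(64, 3, 7, padding=3), nn.Tanh(),
        )

    def forward(self, content, style):
        scale, shift = self.affine(style).chunk(2, dim=1)
        h = content * (1 + scale[..., None, None]) + shift[..., None, None]
        return self.net(h)

# Example-guided translation: content from a cat image, style from a dog exemplar.
cat, dog_exemplar = torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64)
fake_dog = Generator()(ContentEncoder()(cat), StyleEncoder()(dog_exemplar))
print(fake_dog.shape)  # torch.Size([1, 3, 64, 64])
```

The point of the sketch is the separation the abstract argues for: the content path stays a spatial feature map while the style path collapses the exemplar to a small global code, so restyling can be controlled without overwriting content.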