NTU Theses and Dissertations Repository › 電機資訊學院 (College of Electrical Engineering and Computer Science) › 電機工程學系 (Department of Electrical Engineering)
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/68092
Full metadata record
dc.contributor.advisor: 王勝德 (Sheng-De Wang)
dc.contributor.author: Jui-Sheng Liu [en]
dc.contributor.author: 劉叡聲 [zh_TW]
dc.date.accessioned: 2021-06-17T02:12:30Z
dc.date.available: 2019-01-04
dc.date.copyright: 2018-01-04
dc.date.issued: 2017
dc.date.submitted: 2017-12-26
dc.identifier.citation:
[1] Matthew D. Zeiler and Rob Fergus. Visualizing and understanding convolutional networks. In European Conference on Computer Vision. 2014. p. 818-833.
[2] Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in neural information processing systems 2015. p. 91-99.
[3] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition. 2016. p. 770-778.
[4] Florian Schroff, Dmitry Kalenichenko, and James Philbin. FaceNet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015. p. 815-823.
[5] Xiang Wu, Ran He, Zhenan Sun, and Tieniu Tan. A Light CNN for Deep Face Representation with Noisy Labels. arXiv preprint arXiv:1511.02683, 2015.
[6] Deepak Pathak, Philipp Krahenbuhl, Jeff Donahue, Trevor Darrell, and Alexei A. Efros. Context encoders: Feature learning by inpainting. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016. p. 2536-2544.
[7] Richard Zhang, Phillip Isola, and Alexei A. Efros. Colorful Image Colorization. In European Conference on Computer Vision. Springer International Publishing, 2016. p. 649-666.
[8] Christian Ledig, Lucas Theis, Ferenc Huszar, Jose Caballero, Andrew Cunningham, Alejandro Acosta, et al. Photo-realistic single image super-resolution using a generative adversarial network. arXiv preprint arXiv:1609.04802, 2016.
[9] Diederik P Kingma, and Max Welling. Auto-encoding variational bayes. In Proceedings of the International Conference on Learning Representations, 2014.
[10] Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, et al. Generative adversarial nets. In Advances in neural information processing systems. 2014. p. 2672-2680.
[11] Aaron van den Oord, Nal Kalchbrenner, and Koray Kavukcuoglu. Pixel recurrent neural networks. arXiv preprint arXiv:1601.06759, 2016.
[12] Chris Donahue, Zachary C. Lipton, Akshay Balsubramani, and Julian McAuley. Semantically Decomposing the Latent Spaces of Generative Adversarial Networks. arXiv preprint arXiv:1705.07904, 2017.
[13] Jane Bromley, Isabelle Guyon, Yann LeCun, Eduard Säckinger, and Roopak Shah. Signature verification using a 'siamese' time delay neural network. In Advances in Neural Information Processing Systems. 1994. p. 737-744.
[14] Alec Radford, Luke Metz, and Soumith Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434, 2015.
[15] Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, and Xi Chen. Improved techniques for training GANs. In Advances in Neural Information Processing Systems. 2016. p. 2234-2242.
[16] Martin Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein GAN. arXiv preprint arXiv:1701.07875, 2017.
[17] Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, and Aaron Courville. Improved training of Wasserstein GANs. arXiv preprint arXiv:1704.00028, 2017.
[18] Junbo Zhao, Michael Mathieu, and Yann LeCun. Energy-based generative adversarial network. arXiv preprint arXiv:1609.03126, 2016.
[19] David Berthelot, Thomas Schumm, and Luke Metz. BEGAN: Boundary equilibrium generative adversarial networks. arXiv preprint arXiv:1703.10717, 2017.
[20] Mehdi Mirza and Simon Osindero. Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784, 2014.
[21] Augustus Odena, Christopher Olah, and Jonathon Shlens. Conditional image synthesis with auxiliary classifier GANs. arXiv preprint arXiv:1610.09585, 2016.
[22] Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, and Pieter Abbeel. InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets. In Advances in Neural Information Processing Systems. 2016. p. 2172-2180.
[23] Yaniv Taigman, Adam Polyak, and Lior Wolf. Unsupervised cross-domain image generation. arXiv preprint arXiv:1611.02200, 2016.
[24] Guim Perarnau, Joost van de Weijer, Bogdan Raducanu, and Jose M. Álvarez. Invertible conditional GANs for image editing. arXiv preprint arXiv:1611.06355, 2016.
[25] Wei Shen, and Rujie Liu. Learning residual images for face attribute manipulation. arXiv preprint arXiv:1612.05363, 2016.
[26] Jianmin Bao, Dong Chen, Fang Wen, Houqiang Li, and Gang Hua. CVAE-GAN: fine-grained image generation through asymmetric training. arXiv preprint arXiv:1703.10155, 2017.
[27] Yongyi Lu, Yu-Wing Tai, and Chi-Keung Tang. Conditional CycleGAN for attribute guided face image generation. arXiv preprint arXiv:1705.09966, 2017.
[28] Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A. Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. arXiv preprint arXiv:1703.10593, 2017.
[29] Luan Tran, Xi Yin, and Xiaoming Liu. Disentangled representation learning gan for pose-invariant face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017. p. 7.
[30] Anders Boesen Lindbo Larsen, Søren Kaae Sønderby, Hugo Larochelle, and Ole Winther. Autoencoding beyond pixels using a learned similarity metric. arXiv preprint arXiv:1512.09300, 2015.
[31] Leon A. Gatys, Alexander S. Ecker, and Matthias Bethge. A neural algorithm of artistic style. arXiv preprint arXiv:1508.06576, 2015.
[32] Florian Schroff, Dmitry Kalenichenko, and James Philbin. FaceNet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015. p. 815-823.
[33] Yandong Guo, Lei Zhang, Yuxiao Hu, Xiaodong He, and Jianfeng Gao. MS-Celeb-1M: A dataset and benchmark for large-scale face recognition. In European Conference on Computer Vision. Springer International Publishing, 2016. p. 87-102.
[34] Diederik P. Kingma, and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/68092
dc.description.abstract: The Semantically Decomposed GAN (SD-GAN) is an architecture derived from the generative adversarial network that semantically decomposes a face image into an "identity," representing a person's unique facial features, and an "observation," representing all remaining facial attributes such as pose, lighting, and hairstyle. Building on this concept, we further combine the generative adversarial network with a variational autoencoder and propose a new network model. In this model, the variational autoencoder encodes a face image into a Gaussian-distributed vector space during the encoding phase, simultaneously decomposing it into identity and observation components, and then reconstructs the original image from these two parts; the adversarial mechanism of the generative adversarial network drives the decoder to produce images whose quality approaches that of real photographs. This architecture allows the model to generate new face images more intuitively and efficiently, either from a random distribution or from the identity and observation of existing images. To validate the model, we demonstrate how identity and observation affect the generated images and show that the semantic decomposition succeeds. [zh_TW]
dc.description.abstract: As one of the state-of-the-art generative adversarial models, the SD-GAN generates face images from two semantic components, "identity" and "observation." In its formulation, identity stands for a person's unique facial features, and observation stands for all other features, including pose, lighting, hair color, etc. We extend the SD-GAN by combining it with a variational autoencoder, introducing a new network architecture. In this model, the variational autoencoder encodes a face image into a Gaussian-distributed vector space, decomposes it into identity and observation components in the encoding phase, and reconstructs the image in the decoding phase. The generative adversarial network uses adversarial training to enhance the quality of the images produced by the decoder, making them photo-realistic. This architecture enables the model to generate new images, either from a random distribution or from the identity or observation components of existing images, more intuitively and more efficiently. To verify our model, we demonstrate how identity and observation affect the generated images and show the success of the semantic decomposition. [en]
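The encode–decompose–reconstruct pipeline described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the thesis's actual network: random linear maps stand in for the trained convolutional encoder and decoder, and the image and latent dimensions (a flattened 64×64 image, 64-dim identity, 64-dim observation) are hypothetical choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

IMG_DIM = 4096            # flattened 64x64 grayscale face (assumed size)
ID_DIM, OBS_DIM = 64, 64  # assumed split of the latent vector
Z_DIM = ID_DIM + OBS_DIM

# Stand-in "networks": random linear maps instead of trained conv nets.
W_enc_mu = rng.normal(0, 0.01, (Z_DIM, IMG_DIM))
W_enc_logvar = rng.normal(0, 0.01, (Z_DIM, IMG_DIM))
W_dec = rng.normal(0, 0.01, (IMG_DIM, Z_DIM))

def encode(x):
    """Encode an image to the mean/log-variance of a Gaussian latent."""
    return W_enc_mu @ x, W_enc_logvar @ x

def reparameterize(mu, logvar):
    """Sample z = mu + sigma * eps (the VAE reparameterization trick)."""
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def decompose(z):
    """Split the latent into identity and observation components."""
    return z[:ID_DIM], z[ID_DIM:]

def decode(z_id, z_obs):
    """Reconstruct (generate) an image from identity + observation."""
    return W_dec @ np.concatenate([z_id, z_obs])

# Encode two face images, then recombine their semantic components:
x_a, x_b = rng.normal(size=IMG_DIM), rng.normal(size=IMG_DIM)
id_a, obs_a = decompose(reparameterize(*encode(x_a)))
id_b, obs_b = decompose(reparameterize(*encode(x_b)))

x_a_recon = decode(id_a, obs_a)  # reconstruction of face A
x_mix = decode(id_a, obs_b)      # A's identity with B's pose/lighting/etc.
x_new = decode(rng.normal(size=ID_DIM), obs_b)  # sampled identity, B's observation
```

In the full model the decoder doubles as the GAN generator, and a discriminator (together with the identifier network mentioned in the table of contents) is trained adversarially against it so the decoded images become photo-realistic; that adversarial training loop is omitted from this sketch.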
dc.description.provenance: Made available in DSpace on 2021-06-17T02:12:30Z (GMT). No. of bitstreams: 1
ntu-106-R04921055-1.pdf: 1790739 bytes, checksum: 0816f26b065d4d923ae0d6e3a281240c (MD5)
Previous issue date: 2017 [en]
dc.description.tableofcontents:
Acknowledgements i
Chinese Abstract ii
ABSTRACT iii
CONTENTS iv
LIST OF FIGURES vi
LIST OF TABLES vii
Chapter 1 Introduction 1
1.1 Overview of Generative Models 2
1.2 Motivation 3
1.3 Contribution 5
1.4 Thesis Organization 6
Chapter 2 Related Work 7
2.1 Development of GAN 7
2.2 Face images with GAN 8
2.3 Basis of our work 9
Chapter 3 Methodology 10
3.1 Variational Autoencoder 10
3.2 Generative Adversarial Network 12
3.3 Semantically Decomposing VAEGAN 14
3.4 Objective 16
3.4.1 Loss terms of autoencoder 17
3.4.2 Loss terms of encoder 18
3.4.3 Loss terms of discriminator and identifier 19
3.4.4 Loss terms of generator 20
3.4.5 Algorithm 20
Chapter 4 Experiments 23
4.1 Dataset 23
4.2 Detailed network architecture 24
4.3 Generating result 27
Chapter 5 Conclusion 31
REFERENCE 32
dc.language.iso: en
dc.subject: 生成對抗式網路 (generative adversarial network) [zh_TW]
dc.subject: 變分自動編碼器 (variational autoencoder) [zh_TW]
dc.subject: 臉部語意 (facial semantics) [zh_TW]
dc.subject: 語意分解 (semantic decomposition) [zh_TW]
dc.subject: 生成模型 (generative model) [zh_TW]
dc.subject: generative model [en]
dc.subject: generative adversarial network [en]
dc.subject: variational autoencoder [en]
dc.subject: facial semantic [en]
dc.subject: semantic decomposition [en]
dc.title: 以變分自動編碼器與生成對抗式網路對臉部圖片語意分解 (Semantic decomposition of face images with a variational autoencoder and generative adversarial network) [zh_TW]
dc.title: Face image semantically decomposing with variational autoencoder and generative adversarial network [en]
dc.type: Thesis
dc.date.schoolyear: 106-1
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 雷欽隆 (Chin-Laung Lei), 曾俊元
dc.subject.keyword: 生成對抗式網路, 變分自動編碼器, 臉部語意, 語意分解, 生成模型 [zh_TW]
dc.subject.keyword: generative adversarial network, variational autoencoder, facial semantic, semantic decomposition, generative model [en]
dc.relation.page: 36
dc.identifier.doi: 10.6342/NTU201704487
dc.rights.note: 有償授權 (paid authorization)
dc.date.accepted: 2017-12-26
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science) [zh_TW]
dc.contributor.author-dept: 電機工程學研究所 (Graduate Institute of Electrical Engineering) [zh_TW]
Appears in Collections: 電機工程學系 (Department of Electrical Engineering)

Files in This Item:
ntu-106-1.pdf — 1.75 MB, Adobe PDF — restricted access (not publicly available)

