Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/74777
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 王鈺強 |
dc.contributor.author | Jia-Wei Yan | en
dc.contributor.author | 顏嘉緯 | zh_TW
dc.date.accessioned | 2021-06-17T09:07:23Z | -
dc.date.available | 2023-12-25 |
dc.date.copyright | 2019-12-25 |
dc.date.issued | 2019 |
dc.date.submitted | 2019-12-04 |
dc.identifier.citation |
[1] D. Berthelot, C. Raffel, A. Roy, and I. Goodfellow. Understanding and improving interpolation in autoencoders via an adversarial regularizer. In Proceedings of the International Conference on Learning Representations (ICLR), 2019.
[2] X. Chen, Y. Duan, R. Houthooft, J. Schulman, I. Sutskever, and P. Abbeel. Infogan: Interpretable representation learning by information maximizing generative adversarial nets. In Advances in Neural Information Processing Systems (NIPS), 2016.
[3] J. Deng, J. Guo, N. Xue, and S. Zafeiriou. Arcface: Additive angular margin loss for deep face recognition. arXiv preprint arXiv:1801.07698, 2019.
[4] J. Donahue, P. Krähenbühl, and T. Darrell. Adversarial feature learning. In Proceedings of the International Conference on Learning Representations (ICLR), 2017.
[5] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems (NIPS), pages 2672–2680, 2014.
[6] R. Gross, I. Matthews, J. Cohn, T. Kanade, and S. Baker. Multi-pie. Image and Vision Computing, 28(5):807–813, 2010.
[7] R. Hadsell, S. Chopra, and Y. LeCun. Dimensionality reduction by learning an invariant mapping. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2006.
[8] X. He, Y. Zhou, Z. Zhou, S. Bai, and X. Bai. Triplet-center loss for multi-view 3d object retrieval. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
[9] I. Higgins, L. Matthey, A. Pal, C. Burgess, X. Glorot, M. Botvinick, S. Mohamed, and A. Lerchner. beta-vae: Learning basic visual concepts with a constrained variational framework. In Proceedings of the International Conference on Learning Representations (ICLR), 2017.
[10] X. Huang, M.-Y. Liu, S. Belongie, and J. Kautz. Multimodal unsupervised image-to-image translation. In Proceedings of the European Conference on Computer Vision (ECCV), 2018.
[11] D. P. Kingma and M. Welling. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114, 2013.
[12] Y. LeCun, L. Bottou, Y. Bengio, P. Haffner, et al. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
[13] H.-Y. Lee, H.-Y. Tseng, J.-B. Huang, M. K. Singh, and M.-H. Yang. Diverse image-to-image translation via disentangled representations. In Proceedings of the European Conference on Computer Vision (ECCV), 2018.
[14] Z. Li, C. Xu, and B. Leng. Angular triplet-center loss for multi-view 3d shape retrieval. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2019.
[15] M.-Y. Liu, T. Breuel, and J. Kautz. Unsupervised image-to-image translation networks. In Advances in Neural Information Processing Systems (NIPS), 2017.
[16] W. Liu, Y. Wen, Z. Yu, M. Li, B. Raj, and L. Song. Sphereface: Deep hypersphere embedding for face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
[17] W. Liu, Y. Wen, Z. Yu, and M. Yang. Large-margin softmax loss for convolutional neural networks. In Proceedings of the International Conference on Machine Learning (ICML), 2016.
[18] A. Makhzani, J. Shlens, N. Jaitly, I. Goodfellow, and B. Frey. Adversarial autoencoders. arXiv preprint arXiv:1511.05644, 2015.
[19] A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer. Automatic differentiation in pytorch. In Advances in Neural Information Processing Systems Workshops (NIPS Workshops), 2017.
[20] F. Schroff, D. Kalenichenko, and J. Philbin. Facenet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
[21] P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P.-A. Manzagol. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. Journal of Machine Learning Research, 2010.
[22] H. Wang, Y. Wang, Z. Zhou, X. Ji, D. Gong, J. Zhou, Z. Li, and W. Liu. Cosface: Large margin cosine loss for deep face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
[23] Y. Wen, K. Zhang, Z. Li, and Y. Qiao. A discriminative feature learning approach for deep face recognition. In Proceedings of the European Conference on Computer Vision (ECCV), 2016.
[24] Y. Zheng, D. K. Pal, and M. Savvides. Ring loss: Convex feature normalization for face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/74777 | -
dc.description.abstract | 學習可解釋的特徵表示一直是感興趣的主題之一。 大多數現有的研究不能容易地生成或操縱特徵表示,並通過插值生成具有特定語義的圖像。 在本文中,我們提出了一個角度三元組鄰居函數(ATNL),它能夠導出其分佈與語義資訊匹配的潛在特徵表示。 利用ATNL引導的潛在特徵空間,我們進一步利用球面語義內插來生成語義變化的圖像。 我們對MNIST和CMU Multi-PIE數據集的實驗證實了我們的ATNL和球形語義內插對最近的表示學習模型的有效性和強大性。 | zh_TW
dc.description.abstract | Learning interpretable representations has been among the topics of interest. Most existing works cannot easily generate or manipulate latent representations which semantically match the images of interest via interpolation. In this paper, we propose an Angular Triplet-Neighbor Loss (ATNL), which is able to derive latent representations whose distribution would match the semantic information. With the latent space guided by ATNL, we further utilize spherical semantic interpolation for generating semantic warping of images. Our experiments on both MNIST and CMU Multi-PIE datasets confirm the effectiveness and robustness of our ATNL and spherical semantic interpolation over recent representation learning models. | en
dc.description.provenance | Made available in DSpace on 2021-06-17T09:07:23Z (GMT). No. of bitstreams: 1. ntu-108-R06942033-1.pdf: 7043859 bytes, checksum: fbd1c6ac7a4bccaea2e296aa07bc2cb3 (MD5). Previous issue date: 2019 | en
dc.description.tableofcontents |
Acknowledgements (in Chinese)
Acknowledgements
Abstract (in Chinese)
Abstract
1 Introduction
2 Related Work
3 Semantics-Guided Representation Learning
  3.1 VAE for Representation Learning
  3.2 Angular Triplet-Neighbor Loss (ATNL)
  3.3 Semantics-Guided Image Generation
  3.4 Learning of Our Framework
  3.5 Experiments
    3.5.1 Datasets
    3.5.2 Implementation Details
    3.5.3 Visualization via t-SNE projection
    3.5.4 Image Generation via Linear/Spherical Semantic Interpolation
    3.5.5 Quantitative Evaluation
    3.5.6 Assessment of Interpolated Images
    3.5.7 Ablation Studies
    3.5.8 Analysis of ATNL
4 Conclusion
Bibliography
dc.language.iso | en |
dc.subject | 影像生成 | zh_TW
dc.subject | 電腦視覺 | zh_TW
dc.subject | 深度學習 | zh_TW
dc.subject | 特徵學習 | zh_TW
dc.subject | 機器學習 | zh_TW
dc.subject | Image generation | en
dc.subject | Representation learning | en
dc.subject | Deep learning | en
dc.subject | Machine learning | en
dc.subject | Computer vision | en
dc.title | 具有語義引導的特徵學習並應用於視覺化影像生成 | zh_TW
dc.title | Semantics-Guided Representation Learning with Applications to Visual Synthesis | en
dc.type | Thesis |
dc.date.schoolyear | 108-1 |
dc.description.degree | Master's (碩士) |
dc.contributor.oralexamcommittee | 陳祝嵩, 陳駿丞 |
dc.subject.keyword | 影像生成, 特徵學習, 深度學習, 機器學習, 電腦視覺 | zh_TW
dc.subject.keyword | Image generation, Representation learning, Deep learning, Machine learning, Computer vision | en
dc.relation.page | 27 |
dc.identifier.doi | 10.6342/NTU201904354 |
dc.rights.note | Paid license (有償授權) |
dc.date.accepted | 2019-12-04 |
dc.contributor.author-college | 電機資訊學院 | zh_TW
dc.contributor.author-dept | 電信工程學研究所 | zh_TW
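
The abstract mentions spherical semantic interpolation in the latent space but this record does not give its formulation. As a rough illustration only, the sketch below implements the standard spherical linear interpolation (slerp) formula between two latent vectors; the function name `slerp` and the plain-list vector representation are assumptions made here, and the thesis's actual method may differ.

```python
import math
from typing import List, Sequence


def slerp(z0: Sequence[float], z1: Sequence[float], t: float) -> List[float]:
    """Spherically interpolate between latent vectors z0 and z1 at fraction t in [0, 1].

    Standard slerp formula; assumed here for illustration, not taken from the thesis.
    """
    norm0 = math.sqrt(sum(a * a for a in z0))
    norm1 = math.sqrt(sum(b * b for b in z1))
    if norm0 == 0.0 or norm1 == 0.0:
        raise ValueError("slerp is undefined for zero vectors")
    # Angle between the two vectors, clamped to guard against rounding error.
    cos_omega = sum(a * b for a, b in zip(z0, z1)) / (norm0 * norm1)
    omega = math.acos(max(-1.0, min(1.0, cos_omega)))
    sin_omega = math.sin(omega)
    if sin_omega < 1e-8:
        # Nearly parallel or antiparallel vectors: fall back to linear interpolation.
        return [(1.0 - t) * a + t * b for a, b in zip(z0, z1)]
    w0 = math.sin((1.0 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(z0, z1)]


# Interpolating halfway between two orthogonal unit vectors.
midpoint = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
```

For two unit vectors, the slerp midpoint stays on the unit sphere, whereas the linear midpoint `[0.5, 0.5]` has a norm of about 0.707. That difference is the usual motivation for interpolating along the sphere when latent codes are compared by angle.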
Appears in collections: 電信工程學研究所 (Graduate Institute of Communication Engineering)

Files in this item:
File | Size | Format
ntu-108-1.pdf (not authorized for public access) | 6.88 MB | Adobe PDF
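Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.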
