NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/80866
Full metadata record
DC Field: Value (Language)
dc.contributor.advisor: 陳炳宇 (Bing-Yu Chen)
dc.contributor.author: Li-Wen Suen
dc.contributor.author: 蘇俐文 (zh_TW)
dc.date.accessioned: 2022-11-24T03:19:41Z
dc.date.available: 2021-10-04
dc.date.available: 2022-11-24T03:19:41Z
dc.date.copyright: 2021-10-04
dc.date.issued: 2021
dc.date.submitted: 2021-09-30
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/80866
dc.description.abstract: Recently, many image-editing techniques that combine generative adversarial networks with shape control have been proposed. For shape control of face images, the 3D Morphable Model (3DMM) has served as the editing controller. However, no existing method lets users systematically edit man-made objects (e.g., chairs, tables, cups), mainly because there is no good correspondence between editing operations on the image and manipulations in 3D. By combining the strengths of image generative models and 3D shape generative models, this thesis proposes a pipeline that lets users edit an image directly at the level of object parts; the key to the method is establishing the correspondence between the 3D shape feature space and the image generative feature space. In the proposed method, an input image is first linked to a corresponding 3D shape through a forward mapping function; the user then edits that shape to guide corresponding changes in the image. The thesis demonstrates several editing tasks enabled by this method, namely part replacement, part resizing, and viewing-angle manipulation, and conducts extensive experiments to evaluate its effectiveness. (zh_TW)
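The pipeline the abstract describes lends itself to a short sketch. The names MF (the forward W-to-S mapper), MB (the backward S-to-W mapper), and the spaces W (the generator's latent space) and S (the 3D shape attribute space) appear in the table of contents below; every signature in this sketch is an assumption for illustration, not the thesis's actual code.

```python
from typing import Any, Callable

def edit_image(
    image: Any,
    invert: Callable[[Any], Any],     # GAN inversion: image -> latent code w in W
    M_F: Callable[[Any], Any],        # hypothetical forward mapper MF: W -> S (part-level shape attributes)
    M_B: Callable[[Any], Any],        # hypothetical backward mapper MB: S -> W
    generate: Callable[[Any], Any],   # pretrained image generator: W -> image
    user_edit: Callable[[Any], Any],  # part-level edit: replace, resize, or change viewing angle
) -> Any:
    """Embed an image, edit its 3D shape representation per part, and
    regenerate the edited image (a sketch of the abstract's pipeline)."""
    w = invert(image)          # 1. embed the input image in the latent space W
    s = M_F(w)                 # 2. forward mapping: latent code -> editable 3D shape
    s_edited = user_edit(s)    # 3. the user edits the shape, not the pixels
    w_edited = M_B(s_edited)   # 4. backward mapping: edited shape -> latent code
    return generate(w_edited)  # 5. regenerate the image with the edit applied

# Toy smoke test with identity stand-ins (real components would be networks):
identity = lambda x: x
assert edit_image("img", identity, identity, identity, identity, identity) == "img"
```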
dc.description.provenance: Made available in DSpace on 2022-11-24T03:19:41Z (GMT). No. of bitstreams: 1
U0001-2709202113452600.pdf: 11946344 bytes, checksum: 0dbba1e8d330559ff7cc1b47fd626263 (MD5)
Previous issue date: 2021 (en)
dc.description.tableofcontents:
Oral Examination Committee Certification i
Acknowledgements ii
Chinese Abstract iii
Abstract iv
List of Figures viii
List of Tables xii
Chapter 1 Introduction 1
Chapter 2 Related Work 5
  2.1 Image-based shape reconstruction and editing 5
  2.2 3D Shape Representation and Manipulation 6
  2.3 GAN Inversion and Latent Space Manipulation 8
Chapter 3 Method 9
  3.1 3D shape attribute parameterization 9
  3.2 Cross-domain mapping framework 10
    3.2.1 Image inversion 11
    3.2.2 From W to S 11
    3.2.3 From S to W 12
  3.3 Training strategy and loss function 12
    3.3.1 Data preparation 12
    3.3.2 Training for MF 13
    3.3.3 Training for MB 14
    3.3.4 Finetuning for MF and MB 14
Chapter 4 Structural Image Editing 18
  4.1 Part replacement 18
  4.2 Part resizing 19
    4.2.1 Resize trajectory 20
  4.3 Viewing angle manipulation 21
Chapter 5 Experiment and Results 23
  5.1 Implementation details 23
  5.2 Image shape reconstruction 24
    5.2.1 With/Without size finetuning for MF 24
    5.2.2 Finetuning for MF and MB 24
  5.3 Part replacement results 25
  5.4 Part resizing results 26
  5.5 Viewing angle manipulation results 30
  5.6 Testing on real images 33
  5.7 Overall discussion 33
Chapter 6 Limitation and Future Work 37
  6.1 More editing 37
  6.2 Fixed number of parts 38
  6.3 Entanglement between parts 38
  6.4 Complex shape trajectory for trajectory finetuner 38
  6.5 Gap between synthesized images and real images 39
Chapter 7 Conclusion 41
Bibliography 42
dc.language.iso: en
dc.subject: Interaction design (zh_TW)
dc.subject: Deep learning (zh_TW)
dc.subject: Shape representations (zh_TW)
dc.subject: Generative adversarial networks (zh_TW)
dc.subject: Generative adversarial networks (en)
dc.subject: Shape representations (en)
dc.subject: Interaction design (en)
dc.subject: Deep learning (en)
dc.title: StylePart: Image-based Object Shape Part Editing (zh_TW)
dc.title: StylePart: Image-based Shape Part Manipulation (en)
dc.date.schoolyear: 109-2
dc.description.degree: Master's
dc.contributor.oralexamcommittee: 莊永裕 (Yung-Yu Chuang), 王鈺強 (Yu-Chiang Wang)
dc.subject.keyword: Deep learning, Generative adversarial networks, Shape representations, Interaction design (zh_TW)
dc.subject.keyword: Deep learning, Generative adversarial networks, Shape representations, Interaction design (en)
dc.relation.page: 48
dc.identifier.doi: 10.6342/NTU202103390
dc.rights.note: Authorized for release (public access limited to campus)
dc.date.accepted: 2021-10-01
dc.contributor.author-college: College of Electrical Engineering and Computer Science (zh_TW)
dc.contributor.author-dept: Graduate Institute of Computer Science and Information Engineering (zh_TW)
Appears in Collections: Department of Computer Science and Information Engineering

Files in This Item:
File: U0001-2709202113452600.pdf
Size/Format: 11.67 MB, Adobe PDF
Access: limited to NTU campus IP addresses (from off campus, please use the library's VPN service)
Unless their copyright terms are otherwise specified, all items in the system are protected by copyright, with all rights reserved.
