NTU Theses and Dissertations Repository

Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/70884

Full metadata record (DC field: value [language])
dc.contributor.advisor: 莊永裕 (Yung-Yu Chuang)
dc.contributor.author: Yu Tseng [en]
dc.contributor.author: 曾瑀 [zh_TW]
dc.date.accessioned: 2021-06-17T04:42:20Z
dc.date.available: 2021-08-10
dc.date.copyright: 2018-08-10
dc.date.issued: 2018
dc.date.submitted: 2018-08-05
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/70884
dc.description.abstract: This thesis presents a face relighting system that reproduces the lighting pattern of a reference face image on a target face image. We propose a deep learning framework based on a U-Net module with two input streams, which lets the model learn from the reference face image and the target face image simultaneously. Rather than setting the learning target in color space, we have the model learn the Laplacian pyramid of the face. Under the assumption that the high-frequency levels of the pyramid retain more texture information, we set the learning target on the low-frequency part of the pyramid and keep the original high-frequency part. Experimental results show that this design adapts better to lighting of different colors and preserves more detail. We also concatenate the face depth map and the face parsing segmentation map with the face image as input, providing the model with additional facial geometry information.
We conducted a user study to evaluate our system, asking participants to identify the direction of the incoming light in the relit images; 82.4% of the responses were correct. [zh_TW]
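As a concrete illustration of the pyramid split-and-merge step the abstract describes, here is a minimal Python sketch assuming OpenCV's pyrDown/pyrUp; the function predicting the relit low-frequency residual is a hypothetical placeholder for the trained network, not the thesis's actual code:

    import cv2
    import numpy as np

    def build_laplacian_pyramid(img, levels):
        # Decompose an image into high-frequency bands plus a low-frequency residual.
        bands, current = [], img.astype(np.float32)
        for _ in range(levels):
            down = cv2.pyrDown(current)
            up = cv2.pyrUp(down, dstsize=(current.shape[1], current.shape[0]))
            bands.append(current - up)  # high-frequency band at this scale
            current = down
        bands.append(current)           # low-frequency residual
        return bands

    def merge_laplacian_pyramid(bands):
        # Invert the decomposition: upsample and add the bands back, coarse to fine.
        current = bands[-1]
        for band in reversed(bands[:-1]):
            current = cv2.pyrUp(current, dstsize=(band.shape[1], band.shape[0])) + band
        return current

    def predict_low_frequency(target, reference, levels=4):
        # Hypothetical stand-in for the trained relighting model; a real network
        # would output the relit low-frequency residual. Here it returns the
        # target's own residual so the pipeline runs end to end.
        return build_laplacian_pyramid(target, levels)[-1]

    target_face = cv2.imread('target.png')        # placeholder file names
    reference_face = cv2.imread('reference.png')
    bands = build_laplacian_pyramid(target_face, levels=4)
    bands[-1] = predict_low_frequency(target_face, reference_face)  # relit low frequency
    relit = np.clip(merge_laplacian_pyramid(bands), 0, 255).astype(np.uint8)

The design point this sketch reflects: only bands[-1] is ever replaced, so the target's high-frequency texture survives relighting untouched.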
dc.description.abstract: This thesis presents a human face relighting system that relights a target face image with the lighting pattern of a reference face image. We propose a 2-Input-Stream U-Net architecture that learns from the target face and the reference face simultaneously. Instead of learning the face in color space, we operate in frequency space by learning the Laplacian pyramid of the face. Under the assumption that the high-frequency part preserves more texture information, we learn only the low-frequency part and leave the high-frequency part unchanged. Experimental results show that the model adapts better to lighting color and preserves more detail. We also concatenate the depth map and the face parsing segmentation map to the input stack to provide facial geometry information.
To evaluate our model, we conducted a user study asking users to identify the lighting direction of the relit image, obtaining 82.4% accuracy. [en]
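The 2-Input-Stream U-Net can be pictured as two encoders whose features are fused before a shared decoder. Below is a minimal PyTorch sketch under that assumption; the layer sizes, the fusion point, and the 5-channel input stack (RGB + depth + parsing) are illustrative guesses, not the thesis's exact architecture:

    import torch
    import torch.nn as nn

    class TwoStreamUNet(nn.Module):
        # Two encoders (target and reference), features fused at the bottleneck,
        # one decoder with skip connections taken from the target stream.
        def __init__(self, in_ch=5, base=32):
            super().__init__()
            def block(i, o):
                return nn.Sequential(
                    nn.Conv2d(i, o, 3, padding=1),
                    nn.BatchNorm2d(o),
                    nn.ReLU(inplace=True))
            self.enc_t1, self.enc_t2 = block(in_ch, base), block(base, base * 2)
            self.enc_r1, self.enc_r2 = block(in_ch, base), block(base, base * 2)
            self.pool = nn.MaxPool2d(2)
            self.bottleneck = block(base * 4, base * 4)       # fused target+reference features
            self.up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False)
            self.dec1 = block(base * 4 + base * 2, base * 2)  # skip from target stream
            self.dec2 = block(base * 2 + base, base)
            self.out = nn.Conv2d(base, 3, 1)                  # predicted low-frequency RGB image

        def forward(self, target, reference):
            t1 = self.enc_t1(target)
            t2 = self.enc_t2(self.pool(t1))
            r1 = self.enc_r1(reference)
            r2 = self.enc_r2(self.pool(r1))
            fused = self.bottleneck(self.pool(torch.cat([t2, r2], dim=1)))
            d1 = self.dec1(torch.cat([self.up(fused), t2], dim=1))
            d2 = self.dec2(torch.cat([self.up(d1), t1], dim=1))
            return self.out(d2)

    # Each input stack: RGB (3) + depth map (1) + parsing map (1) = 5 channels.
    net = TwoStreamUNet(in_ch=5)
    low_freq = net(torch.randn(1, 5, 128, 128), torch.randn(1, 5, 128, 128))

Feeding both faces through separate encoders and fusing at the bottleneck lets the decoder condition the target's relit low frequency on the reference's lighting, which matches the two-input learning setup the abstract describes.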
dc.description.provenance: Made available in DSpace on 2021-06-17T04:42:20Z (GMT). No. of bitstreams: 1; ntu-107-R05944002-1.pdf: 5054303 bytes, checksum: 724585d66561720d9545aff81120c857 (MD5). Previous issue date: 2018 [en]
dc.description.tableofcontents:
致謝 (Acknowledgements) i
摘要 (Chinese abstract) ii
Abstract iii
List of Figures vii
List of Tables ix
Chapter 1 Introduction 1
Chapter 2 Related Work 3
2.1 Classical Inverse Rendering 3
2.2 Face Intrinsic Decomposition using Neural Network 4
2.3 Face Editing by Ratio Image 4
Chapter 3 Methodology 5
3.1 Lighting Patterns 5
3.2 Dataset 6
3.2.1 Training Data Triplet 6
3.2.2 Data Augmentation 8
3.3 Model 9
3.3.1 2-Input-Stream U-Net Architecture 9
3.3.2 Multiple Input 11
3.3.3 Loss Function 13
3.4 Training Details 17
3.4.1 Training Data 18
3.4.2 Implementation Details 19
Chapter 4 Experiment and Result 20
4.1 Evaluation on Synthetic Data 20
4.2 Result on Flickr2 Dataset 21
4.3 Comparison over Different Output Types 22
4.4 Comparison over Different Input Types 23
4.5 User Study 24
Chapter 5 Discussion 26
5.1 Failure Cases 26
5.2 Limitation 27
Chapter 6 Conclusion 28
Bibliography 29
dc.language.iso: en
dc.subject: 深度影像 (depth map) [zh_TW]
dc.subject: 拉普拉斯金字塔 (Laplacian pyramid) [zh_TW]
dc.subject: 打光模式 (lighting pattern) [zh_TW]
dc.subject: 人臉光影重建 (face relighting) [zh_TW]
dc.subject: 人臉剖析 (face parsing) [zh_TW]
dc.subject: face relighting [en]
dc.subject: face parsing [en]
dc.subject: depth map [en]
dc.subject: Laplacian pyramid [en]
dc.subject: lighting pattern [en]
dc.title: 基於拉普拉斯金字塔的深度人臉影像光影重建 [zh_TW]
dc.title: Deep Face Relighting using Laplacian Pyramid [en]
dc.type: Thesis
dc.date.schoolyear: 106-2
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 林彥宇 (Yen-Yu Lin), 陳煥宗 (Hwann-Tzong Chen)
dc.subject.keyword: 人臉光影重建, 打光模式, 拉普拉斯金字塔, 深度影像, 人臉剖析 [zh_TW]
dc.subject.keyword: face relighting, lighting pattern, Laplacian pyramid, depth map, face parsing [en]
dc.relation.page: 31
dc.identifier.doi: 10.6342/NTU201802521
dc.rights.note: 有償授權 (paid licensing)
dc.date.accepted: 2018-08-06
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science) [zh_TW]
dc.contributor.author-dept: 資訊網路與多媒體研究所 (Graduate Institute of Networking and Multimedia) [zh_TW]
Appears in collections: 資訊網路與多媒體研究所 (Graduate Institute of Networking and Multimedia)

Files in this item:
File: ntu-107-1.pdf (restricted, not authorized for public access) | Size: 4.94 MB | Format: Adobe PDF