Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/83080
Full metadata record
DC Field | Value | Language
dc.contributor.advisor: 莊永裕 (zh_TW)
dc.contributor.advisor: Yung-Yu Chuang (en)
dc.contributor.author: 劉育綸 (zh_TW)
dc.contributor.author: Yu-Lun Liu (en)
dc.date.accessioned: 2023-01-06T17:05:23Z
dc.date.available: 2023-11-09
dc.date.copyright: 2023-01-06
dc.date.issued: 2022
dc.date.submitted: 2022-12-21
dc.identifier.citation: [1] A. Agarwala, M. Dontcheva, M. Agrawala, S. Drucker, A. Colburn, B. Curless, D. Salesin, and M. Cohen. Interactive digital photomontage. In ACM TOG, 2004.
[2] A. O. Akyüz, R. Fleming, B. E. Riecke, E. Reinhard, and H. H. Bülthoff. Do hdr displays support ldr content?: A psychophysical evaluation. ACM TOG, 2007.
[3] J.-B. Alayrac, J. Carreira, and A. Zisserman. The visual centrifuge: Model-free layered video representations. In CVPR, 2019.
[4] K.-A. Aliev, D. Ulyanov, and V. Lempitsky. Neural point-based graphics. In ECCV, 2020.
[5] N. Arvanitopoulos, R. Achanta, and S. Susstrunk. Single image reflection suppression. In CVPR, 2017.
[6] C. Bailer, B. Taetz, and D. Stricker. Flow fields: Dense correspondence fields for highly accurate large displacement optical flow estimation. In ICCV, 2015.
[7] S. Baker, D. Scharstein, J. Lewis, S. Roth, M. J. Black, and R. Szeliski. A database and evaluation methodology for optical flow. IJCV, 92(1):1–31, 2011.
[8] L. Ballan, G. J. Brostow, J. Puwein, and M. Pollefeys. Unstructured video-based rendering: Interactive exploration of casually captured videos. ACM TOG, 2010.
[9] A. Bansal, M. Vo, Y. Sheikh, D. Ramanan, and S. Narasimhan. 4d visualization of dynamic events from unconstrained multi-view videos. In CVPR, 2020.
[10] F. Banterle, K. Debattista, A. Artusi, S. Pattanaik, K. Myszkowski, P. Ledda, and A. Chalmers. High dynamic range imaging and low dynamic range expansion for generating HDR content. Computer Graphics Forum, 2009.
[11] F. Banterle, P. Ledda, K. Debattista, and A. Chalmers. Inverse tone mapping. In International conference on Computer graphics and interactive techniques in Australasia and Southeast Asia, 2006.
[12] F. Banterle, P. Ledda, K. Debattista, and A. Chalmers. Expanding low dynamic range videos for high dynamic range applications. In Spring Conference on Computer Graphics, 2008.
[13] F. Banterle, P. Ledda, K. Debattista, A. Chalmers, and M. Bloj. A framework for inverse tone mapping. The Visual Computer, 2007.
[14] C. Barnes, E. Shechtman, A. Finkelstein, and D. B. Goldman. PatchMatch: A randomized correspondence algorithm for structural image editing. ACM TOG, 2009.
[15] J. T. Barron, B. Mildenhall, M. Tancik, P. Hedman, R. Martin-Brualla, and P. P. Srinivasan. Mip-nerf: A multiscale representation for anti-aliasing neural radiance fields. In ICCV, 2021.
[16] J. T. Barron, B. Mildenhall, D. Verbin, P. P. Srinivasan, and P. Hedman. Mip-nerf 360: Unbounded anti-aliased neural radiance fields. In CVPR, 2022.
[17] E. Be’Ery and A. Yeredor. Blind separation of superimposed shifted images using parameterized joint diagonalization. TIP, 17(3):340–353, 2008.
[18] S. Bell, K. Bala, and N. Snavely. Intrinsic images in the wild. ACM TOG, 33(4):159, 2014.
[19] Y. Boykov and V. Kolmogorov. An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision. TPAMI, 2004.
[20] R. W. Brislin. Back-translation for cross-cultural research. Journal of cross-cultural psychology, 1970.
[21] M. Brown and D. G. Lowe. Recognising panoramas. In ICCV, 2003.
[22] M. S. Brown and S. Kim. Understanding the in-camera image processing pipeline for computer vision. In IEEE International Conference on Computer Vision (ICCV)-Tutorial, volume 3, 2019.
[23] M. Broxton, J. Flynn, R. Overbeck, D. Erickson, P. Hedman, M. Duvall, J. Dourgarian, J. Busch, M. Whalen, and P. Debevec. Immersive light field video with a layered mesh representation. ACM Transactions on Graphics (TOG), 2020.
[24] C. Buehler, M. Bosse, and L. McMillan. Non-metric image-based rendering for video stabilization. In CVPR, 2001.
[25] C. Buehler, M. Bosse, L. McMillan, S. Gortler, and M. Cohen. Unstructured lumigraph rendering. In SIGGRAPH, 2001.
[26] D. J. Butler, J. Wulff, G. B. Stanley, and M. J. Black. A naturalistic open source movie for optical flow evaluation. In ECCV, 2012.
[27] J. Canny. A computational approach to edge detection. TPAMI, 8(6):679–698, 1986.
[28] J. Carranza, C. Theobalt, M. A. Magnor, and H.-P. Seidel. Free-viewpoint video of human actors. ACM TOG, 2003.
[29] A. Chakrabarti, Y. Xiong, B. Sun, T. Darrell, D. Scharstein, T. Zickler, and K. Saenko. Modeling radiometric uncertainty for vision with tone-mapped color images. TPAMI, 2014.
[30] G. Chaurasia, S. Duchene, O. Sorkine-Hornung, and G. Drettakis. Depth synthesis and local warps for plausible image-based navigation. ACM TOG, 2013.
[31] A. Chen, Z. Xu, A. Geiger, J. Yu, and H. Su. Tensorf: Tensorial radiance fields. In ECCV, 2022.
[32] S. E. Chen and L. Williams. View interpolation for image synthesis. In SIGGRAPH, 1993.
[33] Y. Chen, C. Schmid, and C. Sminchisescu. Self-supervised learning with geometric constraints in monocular video: Connecting flow, depth, and camera. In ICCV, 2019.
[34] Y.-C. Chen, C. Gao, E. Robb, and J.-B. Huang. Nas-dip: Learning deep image prior with neural architecture search. In ECCV, 2020.
[35] I. Choi, O. Gallo, A. Troccoli, M. H. Kim, and J. Kautz. Extreme view synthesis. In ICCV, 2019.
[36] J. Choi and I. S. Kweon. Deep iterative frame interpolation for full-frame video stabilization. ACM TOG, 2020.
[37] A. Collet, M. Chuang, P. Sweeney, D. Gillett, D. Evseev, D. Calabrese, H. Hoppe, A. Kirk, and S. Sullivan. High-quality streamable free-viewpoint video. ACM TOG, 2015.
[38] S. J. Daly and X. Feng. Decontouring: Prevention and removal of false contour artifacts. In Human Vision and Electronic Imaging IX, 2004.
[39] D.-T. Dang-Nguyen, C. Pasquini, V. Conotter, and G. Boato. Raise: A raw images dataset for digital image forensics. In ACM MM, 2015.
[40] P. Debevec. Image-based lighting. In ACM SIGGRAPH 2006 Courses, pages 4–es. 2006.
[41] P. E. Debevec and J. Malik. Recovering high dynamic range radiance maps from photographs. ACM TOG, 1997.
[42] P. E. Debevec, C. J. Taylor, and J. Malik. Modeling and rendering architecture from photographs: A hybrid geometry-and image-based approach. In SIGGRAPH, 1996.
[43] A. Dosovitskiy, P. Fischer, E. Ilg, P. Hausser, C. Hazirbas, V. Golkov, P. van der Smagt, D. Cremers, and T. Brox. FlowNet: Learning optical flow with convolutional networks. In ICCV, 2015.
[44] M. Dou, S. Khamis, Y. Degtyarev, P. Davidson, S. R. Fanello, A. Kowdle, S. O. Escolano, C. Rhemann, D. Kim, J. Taylor, et al. Fusion4d: Real-time performance capture of challenging scenes. ACM TOG, 2016.
[45] R. A. Drebin, L. Carpenter, and P. Hanrahan. Volume rendering. ACM TOG, 1988.
[46] C. Du, B. Kang, Z. Xu, J. Dai, and T. Nguyen. Accurate and efficient video defencing using convolutional neural networks and temporal information. In ICME, 2018.
[47] A. A. Efros and W. T. Freeman. Image quilting for texture synthesis and transfer. ACM TOG, 2001.
[48] D. Eigen, C. Puhrsch, and R. Fergus. Depth map prediction from a single image using a multi-scale deep network. In NIPS, 2014.
[49] G. Eilertsen, J. Kronander, G. Denes, R. K. Mantiuk, and J. Unger. HDR image reconstruction from a single exposure using deep CNNs. ACM TOG, 2017.
[50] E. Eisemann and F. Durand. Flash photography enhancement via intrinsic relighting. ACM TOG, 23(3):673–678, 2004.
[51] Y. Endo, Y. Kanamori, and J. Mitani. Deep reverse tone mapping. ACM TOG, 2017.
[52] J. Engel, V. Koltun, and D. Cremers. Direct sparse odometry. IEEE TPAMI, 2017.
[53] J. Engel, T. Schöps, and D. Cremers. Lsd-slam: Large-scale direct monocular slam. In ECCV, 2014.
[54] Q. Fan, J. Yang, G. Hua, B. Chen, and D. Wipf. A generic deep architecture for single image reflection removal and image smoothing. In ICCV, 2017.
[55] J. Fang, T. Yi, X. Wang, L. Xie, X. Zhang, W. Liu, M. Nießner, and Q. Tian. Fast dynamic radiance fields with time-aware neural voxels. ACM TOG, 2022.
[56] C. Finn, P. Abbeel, and S. Levine. Model-agnostic meta-learning for fast adaptation of deep networks. arXiv preprint arXiv:1703.03400, 2017.
[57] J. Flynn, M. Broxton, P. Debevec, M. DuVall, G. Fyffe, R. Overbeck, N. Snavely, and R. Tucker. Deepview: View synthesis with learned gradient descent. In CVPR, 2019.
[58] J. Flynn, I. Neulander, J. Philbin, and N. Snavely. Deepstereo: Learning to predict new views from the world’s imagery. In CVPR, 2016.
[59] S. Fridovich-Keil, A. Yu, M. Tancik, Q. Chen, B. Recht, and A. Kanazawa. Plenoxels: Radiance fields without neural networks. In CVPR, 2022.
[60] D. Gadot and L. Wolf. PatchBatch: a batch augmented loss for optical flow. In CVPR, 2016.
[61] K. Gai, Z. Shi, and C. Zhang. Blind separation of superimposed images with unknown motions. In CVPR, 2009.
[62] K. Gai, Z. Shi, and C. Zhang. Blind separation of superimposed moving images using image statistics. TPAMI, 34(1):19–32, 2011.
[63] Y. Gandelsman, A. Shocher, and M. Irani. Double-DIP: Unsupervised image decomposition via coupled deep-image-priors. In CVPR, 2019.
[64] C. Gao, A. Saraf, J.-B. Huang, and J. Kopf. Flow-edge guided video completion. In ECCV, 2020.
[65] C. Gao, A. Saraf, J. Kopf, and J.-B. Huang. Dynamic view synthesis from dynamic monocular video. In ICCV, 2021.
[66] H. Gao, R. Li, S. Tulsiani, B. Russell, and A. Kanazawa. Monocular dynamic view synthesis: A reality check. In NIPS, 2022.
[67] V. Garcia and J. Bruna. Few-shot learning with graph neural networks. In ICLR, 2018.
[68] M. L. Gleicher and F. Liu. Re-cinematography: Improving the camerawork of casual video. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2008.
[69] C. Godard, O. Mac Aodha, and G. J. Brostow. Unsupervised monocular depth estimation with left-right consistency. In CVPR, 2017.
[70] C. Godard, O. Mac Aodha, M. Firman, and G. J. Brostow. Digging into self-supervised monocular depth estimation. In ICCV, 2019.
[71] A. Goldstein and R. Fattal. Video stabilization using epipolar geometry. ACM TOG, 2012.
[72] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial nets. In NIPS, 2014.
[73] S. J. Gortler, R. Grzeszczuk, R. Szeliski, and M. F. Cohen. The lumigraph. In SIGGRAPH, 1996.
[74] M. D. Grossberg and S. K. Nayar. What is the space of camera response functions? In CVPR, 2003.
[75] R. Grosse, M. K. Johnson, E. H. Adelson, and W. T. Freeman. Ground truth dataset and baseline evaluations for intrinsic image algorithms. In ICCV, 2009.
[76] M. Grundmann, V. Kwatra, D. Castro, and I. Essa. Calibration-free rolling shutter removal. In ICCP, 2012.
[77] M. Grundmann, V. Kwatra, and I. Essa. Auto-directed video stabilization with robust L1 optimal camera paths. In CVPR, 2011.
[78] F. Güney and A. Geiger. Deep discrete flow. In ACCV, 2016.
[79] X. Guo, X. Cao, and Y. Ma. Robust separation of reflection from multiple images. In CVPR, 2014.
[80] M. Habermann, W. Xu, M. Zollhoefer, G. Pons-Moll, and C. Theobalt. Livecap: Real-time human performance capture from monocular video. ACM TOG, 2019.
[81] K. He, G. Gkioxari, P. Dollár, and R. Girshick. Mask r-cnn. In ICCV, 2017.
[82] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, 2016.
[83] P. Hedman, J. Philip, T. Price, J.-M. Frahm, G. Drettakis, and G. Brostow. Deep blending for free-viewpoint image-based rendering. ACM TOG, 2018.
[84] S. Hochreiter, A. S. Younger, and P. R. Conwell. Learning to learn using gradient descent. In International Conference on Artificial Neural Networks, 2001.
[85] X. Hou and G. Qiu. Image companding and inverse halftoning using deep convolutional neural networks. arXiv, 2017.
[86] H.-P. Huang, H.-Y. Tseng, H.-Y. Lee, and J.-B. Huang. Semantic view synthesis. In ECCV, 2020.
[87] J.-B. Huang, S. B. Kang, N. Ahuja, and J. Kopf. Image completion using planar structure guidance. ACM TOG, 2014.
[88] J.-B. Huang, S. B. Kang, N. Ahuja, and J. Kopf. Temporally coherent completion of dynamic video. ACM TOG, 35(6):196, 2016.
[89] Q.-X. Huang and L. Guibas. Consistent shape maps via semidefinite programming. In Proceedings of the Eleventh Eurographics/ACM SIGGRAPH Symposium on Geometry Processing, 2013.
[90] Y. Huo, F. Yang, L. Dong, and V. Brost. Physiological inverse tone mapping based on retina response. The Visual Computer, 2014.
[91] S. Iizuka, E. Simo-Serra, and H. Ishikawa. Globally and locally consistent image completion. ACM TOG, 2017.
[92] S. Ilan and A. Shamir. A survey on data-driven video completion. Computer Graphics Forum, 34(6):60–85, 2015.
[93] M. Jaderberg, K. Simonyan, A. Zisserman, and K. Kavukcuoglu. Spatial transformer networks. In NIPS, 2015.
[94] J. Jeon, S. Cho, X. Tong, and S. Lee. Intrinsic image decomposition using structure-texture separation and surface normals. In ECCV, 2014.
[95] Y. Jeong, S. Ahn, C. Choy, A. Anandkumar, M. Cho, and J. Park. Self-calibrating neural radiance fields. In ICCV, 2021.
[96] H. Jiang, D. Sun, V. Jampani, M.-H. Yang, E. Learned-Miller, and J. Kautz. Super slomo: High quality estimation of multiple intermediate frames for video interpolation. In CVPR, 2018.
[97] M. Jin, S. Süsstrunk, and P. Favaro. Learning to see through reflections. In ICCP, 2018.
[98] J. Johnson, A. Alahi, and L. Fei-Fei. Perceptual losses for real-time style transfer and super-resolution. In ECCV, 2016.
[99] S. Jonna, K. K. Nakka, and R. R. Sahay. Deep learning based fence segmentation and removal from an image using a video sequence. In ECCV, 2016.
[100] S. Jonna, S. Satapathy, and R. R. Sahay. Stereo image de-fencing using smartphones. In ICASSP, 2017.
[101] J. T. Kajiya and B. P. Von Herzen. Ray tracing volume densities. ACM TOG, 1984.
[102] Z. Kalal, K. Mikolajczyk, and J. Matas. Tracking-learning-detection. TPAMI, 34(7):1409–1422, 2011.
[103] N. K. Kalantari and R. Ramamoorthi. Deep high dynamic range imaging of dynamic scenes. ACM TOG, 2017.
[104] N. Kanopoulos, N. Vasanthavada, and R. L. Baker. Design of an image edge detection filter using the Sobel operator. JSSC, 1988.
[105] H. C. Karaimer and M. S. Brown. A software platform for manipulating the camera imaging pipeline. In ECCV, 2016.
[106] E. A. Khan, A. O. Akyuz, and E. Reinhard. Ghost removal in high dynamic range images. In ICIP, 2006.
[107] S. J. Kim, H. T. Lin, Z. Lu, S. Süsstrunk, S. Lin, and M. S. Brown. A new in-camera imaging model for color computer vision and its application. TPAMI, 2012.
[108] D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. In ICLR, 2014.
[109] A. Knapitsch, J. Park, Q.-Y. Zhou, and V. Koltun. Tanks and temples: Benchmarking large-scale scene reconstruction. ACM TOG, 2017.
[110] J. Kopf, M. F. Cohen, and R. Szeliski. First-person hyper-lapse videos. ACM TOG, 2014.
[111] J. Kopf, K. Matzen, S. Alsisan, O. Quigley, F. Ge, Y. Chong, J. Patterson, J.-M. Frahm, S. Wu, M. Yu, et al. One shot 3D photography. ACM TOG, 2020.
[112] J. Kopf, X. Rong, and J.-B. Huang. Robust consistent video depth estimation. In CVPR, 2021.
[113] R. P. Kovaleski and M. M. Oliveira. High-quality reverse tone mapping for a wide range of exposures. In 2014 27th SIBGRAPI Conference on Graphics, Patterns and Images, 2014.
[114] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In NIPS, 2012.
[115] S. Kumar, Y. Dai, and H. Li. Monocular dense 3d reconstruction of a complex dynamic scene from two perspective frames. In ICCV, 2017.
[116] W.-S. Lai, J.-B. Huang, Z. Hu, N. Ahuja, and M.-H. Yang. A comparative study for single image blind deblurring. In CVPR, 2016.
[117] K.-Y. Lee, Y.-Y. Chuang, B.-Y. Chen, and M. Ouhyoung. Video stabilization using robust feature trajectories. In ICCV, 2009.
[118] S. Lee, G. H. An, and S.-J. Kang. Deep chain hdri: Reconstructing a high dynamic range image from a single low dynamic range image. IEEE Access, 2018.
[119] S. Lee, G. H. An, and S.-J. Kang. Deep recursive hdri: Inverse tone mapping using generative adversarial networks. In ECCV, 2018.
[120] S. Lee, J. Lee, B. Kim, K. Kim, and J. Noh. Video extrapolation using neighboring frames. ACM TOG, 2019.
[121] M. Levoy and P. Hanrahan. Light field rendering. In SIGGRAPH, 1996.
[122] H. Li and P. Peers. Crf-net: Single image radiometric calibration using CNNs. In European Conference on Visual Media Production, 2017.
[123] Y. Li and M. S. Brown. Exploiting reflection change for automatic reflection removal. In ICCV, 2013.
[124] Y. Li and M. S. Brown. Single image layer separation using relative smoothness. In CVPR, 2014.
[125] Z. Li, T. Dekel, F. Cole, R. Tucker, N. Snavely, C. Liu, and W. T. Freeman. Learning the depths of moving people by watching frozen people. In CVPR, 2019.
[126] Z. Li, S. Niklaus, N. Snavely, and O. Wang. Neural scene flow fields for space-time view synthesis of dynamic scenes. In CVPR, 2021.
[127] Z. Li, M. Shafiei, R. Ramamoorthi, K. Sunkavalli, and M. Chandraker. Inverse rendering for complex indoor scenes: Shape, spatially-varying lighting and svbrdf from a single image. In CVPR, 2020.
[128] X. Liang, L. Lee, W. Dai, and E. P. Xing. Dual motion GAN for future-flow embedded video prediction. In ICCV, 2017.
[129] Z. Liang, J. Xu, D. Zhang, Z. Cao, and L. Zhang. A hybrid l1-l0 layer decomposition model for tone mapping. In CVPR, 2018.
[130] C.-H. Lin, W.-C. Ma, A. Torralba, and S. Lucey. Barf: Bundle-adjusting neural radiance fields. In ICCV, 2021.
[131] K.-E. Lin, L. Xiao, F. Liu, G. Yang, and R. Ramamoorthi. Deep 3d mask volume for view synthesis of dynamic scenes. In ICCV, 2021.
[132] S. Lin, J. Gu, S. Yamazaki, and H.-Y. Shum. Radiometric calibration from a single image. In CVPR, 2004.
[133] S. Lin and L. Zhang. Determining the radiometric response function from a single grayscale image. In CVPR, 2005.
[134] C. Liu, X. Wu, and X. Shu. Learning-based dequantization for image restoration against extremely poor illumination. arXiv, 2018.
[135] C. Liu, J. Yuen, A. Torralba, J. Sivic, and W. T. Freeman. Sift flow: Dense correspondence across different scenes. In ECCV, 2008.
[136] F. Liu, M. Gleicher, H. Jin, and A. Agarwala. Content-preserving warps for 3D video stabilization. ACM TOG, 2009.
[137] F. Liu, M. Gleicher, J. Wang, H. Jin, and A. Agarwala. Subspace video stabilization. ACM TOG, 2011.
[138] G. Liu, F. A. Reda, K. J. Shih, T.-C. Wang, A. Tao, and B. Catanzaro. Image inpainting for irregular holes using partial convolutions. In ECCV, 2018.
[139] S. Liu, Y. Wang, L. Yuan, J. Bu, P. Tan, and J. Sun. Video stabilization with a depth camera. In CVPR, 2012.
[140] S. Liu, L. Yuan, P. Tan, and J. Sun. Bundled camera paths for video stabilization. ACM TOG, 2013.
[141] S. Liu, L. Yuan, P. Tan, and J. Sun. Steadyflow: Spatially smooth optical flow for video stabilization. In CVPR, 2014.
[142] Y.-L. Liu, W.-S. Lai, Y.-S. Chen, Y.-L. Kao, M.-H. Yang, Y.-Y. Chuang, and J.-B. Huang. Single-image hdr reconstruction by learning to reverse the camera pipeline. In CVPR, 2020.
[143] Y.-L. Liu, W.-S. Lai, M.-H. Yang, Y.-Y. Chuang, and J.-B. Huang. Learning to see through obstructions. In CVPR, 2020.
[144] Y.-L. Liu, W.-S. Lai, M.-H. Yang, Y.-Y. Chuang, and J.-B. Huang. Hybrid neural fusion for full-frame video stabilization. In ICCV, 2021.
[145] Y.-L. Liu, W.-S. Lai, M.-H. Yang, Y.-Y. Chuang, and J.-B. Huang. Learning to see through obstructions with layered decomposition. TPAMI, 2021.
[146] Y.-L. Liu, Y.-T. Liao, Y.-Y. Lin, and Y.-Y. Chuang. Deep video frame interpolation using cyclic frame generation. In AAAI, 2019.
[147] Z. Liu, R. Yeh, X. Tang, Y. Liu, and A. Agarwala. Video frame synthesis using deep voxel flow. In ICCV, 2017.
[148] S. Lombardi, T. Simon, J. Saragih, G. Schwartz, A. Lehrmann, and Y. Sheikh. Neural volumes: Learning dynamic renderable volumes from images. ACM TOG, 2019.
[149] G. Long, L. Kneip, J. M. Alvarez, H. Li, X. Zhang, and Q. Yu. Learning image matching by simply watching video. In ECCV, 2016.
[150] X. Luo, J.-B. Huang, R. Szeliski, K. Matzen, and J. Kopf. Consistent video depth estimation. ACM TOG, 39(4), 2020.
[151] S. Mangiat and J. Gibson. High dynamic range video with ghost removal. In Applications of Digital Image Processing XXXIII, 2010.
[152] S. Mann and R. W. Picard. On being ‘undigital’ with digital cameras: Extending dynamic range by combining differently exposed pictures. In Proceedings of IS&T, 1995.
[153] R. Mantiuk, K. J. Kim, A. G. Rempel, and W. Heidrich. HDR-VDP-2: A calibrated visual metric for visibility and quality predictions in all luminance conditions. ACM TOG, 2011.
[154] D. Marnerides, T. Bashford-Rogers, J. Hatchett, and K. Debattista. ExpandNet: A deep convolutional neural network for high dynamic range expansion from low dynamic range content. In EG, 2018.
[155] D. Marr and E. Hildreth. Theory of edge detection. Proc. R. Soc. Lond. B, 1980.
[156] R. Martin-Brualla, N. Radwan, M. S. Sajjadi, J. T. Barron, A. Dosovitskiy, and D. Duckworth. Nerf in the wild: Neural radiance fields for unconstrained photo collections. In CVPR, 2021.
[157] Y. Matsushita and S. Lin. Radiometric calibration from noise distributions. In CVPR, 2007.
[158] Y. Matsushita, E. Ofek, X. Tang, and H.-Y. Shum. Full-frame video stabilization. In CVPR, 2005.
[159] B. Mildenhall, P. P. Srinivasan, R. Ortiz-Cayon, N. K. Kalantari, R. Ramamoorthi, R. Ng, and A. Kar. Local light field fusion: Practical view synthesis with prescriptive sampling guidelines. ACM TOG, 2019.
[160] B. Mildenhall, P. P. Srinivasan, M. Tancik, J. T. Barron, R. Ramamoorthi, and R. Ng. NeRF: Representing scenes as neural radiance fields for view synthesis. In ECCV, 2020.
[161] Y. Mu, W. Liu, and S. Yan. Video de-fencing. IEEE Transactions on Circuits and Systems for Video Technology, 24(7):1111–1121, 2013.
[162] T. Müller, A. Evans, C. Schied, and A. Keller. Instant neural graphics primitives with a multiresolution hash encoding. ACM TOG, 2022.
[163] R. Mur-Artal, J. M. M. Montiel, and J. D. Tardos. Orb-slam: a versatile and accurate monocular slam system. IEEE transactions on robotics, 2015.
[164] R. Mur-Artal and J. D. Tardós. Orb-slam2: An open-source slam system for monocular, stereo, and rgb-d cameras. IEEE transactions on robotics, 2017.
[165] A. Nandoriya, M. Elgharib, C. Kim, M. Hefeeda, and W. Matusik. Video reflection removal through spatio-temporal optimization. In ICCV, 2017.
[166] H. Nemoto, P. Korshunov, P. Hanhart, and T. Ebrahimi. Visual attention in ldr and hdr images. In 9th International Workshop on Video Processing and Quality Metrics for Consumer Electronics (VPQM), 2015.
[167] T. Nestmeyer, J.-F. Lalonde, I. Matthews, and A. Lehrmann. Learning physics-guided face relighting under directional light. In CVPR, 2020.
[168] R. A. Newcombe, S. J. Lovegrove, and A. J. Davison. Dtam: Dense tracking and mapping in real-time. In ICCV, 2011.
[169] T.-T. Ng, S.-F. Chang, and M.-P. Tsui. Using geometry invariants for camera response function estimation. In CVPR, 2007.
[170] A. Nichol, J. Achiam, and J. Schulman. On first-order meta-learning algorithms. arXiv preprint arXiv:1803.02999, 2018.
[171] A. Nichol and J. Schulman. Reptile: a scalable metalearning algorithm. arXiv preprint arXiv:1803.02999, 2(3):4, 2018.
[172] S. Niklaus and F. Liu. Context-aware synthesis for video frame interpolation. In CVPR, 2018.
[173] S. Niklaus, L. Mai, and F. Liu. Video frame interpolation via adaptive convolution. In CVPR, 2017.
[174] S. Niklaus, L. Mai, and F. Liu. Video frame interpolation via adaptive separable convolution. In ICCV, 2017.
[175] S. Niklaus, L. Mai, J. Yang, and F. Liu. 3D Ken Burns effect from a single image. ACM TOG, 2019.
[176] A. Odena, V. Dumoulin, and C. Olah. Deconvolution and checkerboard artifacts. Distill, 2016.
[177] S. Orts-Escolano, C. Rhemann, S. Fanello, W. Chang, A. Kowdle, Y. Degtyarev, D. Kim, P. L. Davidson, S. Khamis, M. Dou, et al. Holoportation: Virtual 3d teleportation in real-time. In Proceedings of the 29th annual symposium on user interface software and technology, pages 741–754, 2016.
[178] H. S. Park, T. Shiratori, I. Matthews, and Y. Sheikh. 3d reconstruction of a moving point from a series of 2d projections. In ECCV, 2010.
[179] K. Park, U. Sinha, J. T. Barron, S. Bouaziz, D. B. Goldman, S. M. Seitz, and R. Martin-Brualla. Nerfies: Deformable neural radiance fields. In CVPR, 2021.
[180] K. Park, U. Sinha, P. Hedman, J. T. Barron, S. Bouaziz, D. B. Goldman, R. Martin-Brualla, and S. M. Seitz. Hypernerf: A higher-dimensional representation for topologically varying neural radiance fields. ACM TOG, 2021.
[181] M. Park, K. Brocklehurst, R. T. Collins, and Y. Liu. Image de-fencing revisited. In ACCV, 2010.
[182] D. Pathak, P. Krahenbuhl, J. Donahue, T. Darrell, and A. A. Efros. Context encoders: Feature learning by inpainting. In CVPR, 2016.
[183] E. Penner and L. Zhang. Soft 3D reconstruction for view synthesis. ACM TOG, 2017.
[184] F. Perazzi, J. Pont-Tuset, B. McWilliams, L. Van Gool, M. Gross, and A. Sorkine-Hornung. A benchmark dataset and evaluation methodology for video object segmentation. In CVPR, 2016.
[185] Photomatix. https://www.hdrsoft.com/.
[186] A. Pumarola, E. Corona, G. Pons-Moll, and F. Moreno-Noguer. D-nerf: Neural radiance fields for dynamic scenes. In CVPR, 2021.
[187] A. Punnappurath and M. S. Brown. Reflection removal using a dual-pixel sensor. In CVPR, 2019.
[188] R. Qian, R. T. Tan, W. Yang, J. Su, and J. Liu. Attentive generative adversarial network for raindrop removal from a single image. In CVPR, 2018.
[189] R. Ranftl, K. Lasinger, D. Hafner, K. Schindler, and V. Koltun. Towards robust monocular depth estimation: Mixing datasets for zero-shot cross-dataset transfer. IEEE TPAMI, 2020.
[190] S. Ravi and H. Larochelle. Optimization as a model for few-shot learning. In ICLR, 2017.
[191] G. Riegler and V. Koltun. Free view synthesis. In ECCV, 2020.
[192] G. Riegler and V. Koltun. Stable view synthesis. In CVPR, 2021.
[193] M. Rubinstein, D. Gutierrez, O. Sorkine, and A. Shamir. A comparative study of image retargeting. ACM TOG, 2010.
[194] C. Russell, R. Yu, and L. Agapito. Video pop-up: Monocular 3d reconstruction of dynamic scenes. In ECCV, 2014.
[195] S. Saito, Z. Huang, R. Natsume, S. Morishima, A. Kanazawa, and H. Li. PIFu: Pixel-aligned implicit function for high-resolution clothed human digitization. In ICCV, 2019.
[196] J. L. Schönberger and J.-M. Frahm. Structure-from-motion revisited. In CVPR, 2016.
[197] S. M. Seitz and C. R. Dyer. View morphing. In SIGGRAPH, 1996.
[198] S. Sengupta, J. Gu, K. Kim, G. Liu, D. W. Jacobs, and J. Kautz. Neural inverse rendering of an indoor scene from a single image. In ICCV, 2019.
[199] M.-L. Shih, S.-Y. Su, J. Kopf, and J.-B. Huang. 3D photography using context-aware layered depth inpainting. In CVPR, 2020.
[200] Y. Shih, D. Krishnan, F. Durand, and W. T. Freeman. Reflection removal using ghosting cues. In CVPR, 2015.
[201] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015.
[202] S. Sinha, R. Shapovalov, J. Reizenstein, I. Rocco, N. Neverova, A. Vedaldi, and D. Novotny. Common pets in 3d: Dynamic new-view synthesis of real-life deformable categories. arXiv preprint arXiv:2211.03889, 2022.
[203] S. N. Sinha, J. Kopf, M. Goesele, D. Scharstein, and R. Szeliski. Image-based rendering for scenes with reflections. ACM TOG, 31(4):100–1, 2012.
[204] V. Sitzmann, J. Thies, F. Heide, M. Nießner, G. Wetzstein, and M. Zollhofer. DeepVoxels: Learning persistent 3D feature embeddings. In CVPR, 2019.
[205] V. Sitzmann, M. Zollhöfer, and G. Wetzstein. Scene representation networks: Continuous 3D-structure-aware neural scene representations. In NIPS, 2019.
[206] B. M. Smith, L. Zhang, H. Jin, and A. Agarwala. Light field video stabilization. In ICCV, 2009.
[207] J. Snell, K. Swersky, and R. Zemel. Prototypical networks for few-shot learning. In NIPS, 2017.
[208] Q. Song, G.-M. Su, and P. C. Cosman. Hardware-efficient debanding and visual enhancement filter for inverse tone mapped high dynamic range images and videos. In ICIP, 2016.
[209] K. Soomro, A. R. Zamir, and M. Shah. UCF101: A dataset of 101 human actions classes from videos in the wild. Technical Report CRCV-TR-12-01, University of Central Florida, 2012.
[210] A. Srikantha and D. Sidibé. Ghost detection and removal for high dynamic range images: Recent advances. Signal Processing: Image Communication, 2012.
[211] P. P. Srinivasan, R. Tucker, J. T. Barron, R. Ramamoorthi, R. Ng, and N. Snavely. Pushing the boundaries of view extrapolation with multiplane images. In CVPR, 2019.
[212] S. Su, M. Delbracio, J. Wang, G. Sapiro, W. Heidrich, and O. Wang. Deep video deblurring for hand-held cameras. In CVPR, 2017.
[213] C. Sun, M. Sun, and H.-T. Chen. Direct voxel grid optimization: Super-fast convergence for radiance fields reconstruction. In CVPR, 2022.
[214] D. Sun, X. Yang, M.-Y. Liu, and J. Kautz. Pwc-net: Cnns for optical flow using pyramid, warping, and cost volume. In CVPR, 2018.
[215] S.-H. Sun, M. Huh, Y.-H. Liao, N. Zhang, and J. J. Lim. Multi-view to novel view: Synthesizing novel views with self-learned confidence. In ECCV, 2018.
[216] Y. Sun, X. Wang, Z. Liu, J. Miller, A. A. Efros, and M. Hardt. Test-time training for out-of-distribution generalization. In ICML, 2020.
[217] N. Sundaram, T. Brox, and K. Keutzer. Dense point trajectories by GPU-accelerated large displacement optical flow. In ECCV, 2010.
[218] R. Szeliski. Image alignment and stitching: A tutorial. Foundations and Trends in Computer Graphics and Vision, 2(1):1–104, 2006.
[219] R. Szeliski, S. Avidan, and P. Anandan. Layer extraction from multiple images containing reflections and transparency. In CVPR, 2000.
[220] J. Takamatsu, Y. Matsushita, and K. Ikeuchi. Estimating radiometric response functions from image noise variance. In ECCV, 2008.
[221] Z. Teed and J. Deng. RAFT: Recurrent all-pairs field transforms for optical flow. In ECCV, 2020.
[222] Z. Teed and J. Deng. Droid-slam: Deep visual slam for monocular, stereo, and rgb-d cameras. In NIPS, 2021.
[223] D. Teney and M. Hebert. Learning to extract motion from videos in convolutional neural networks. In ACCV, 2016.
[224] J. Thies, M. Zollhöfer, and M. Nießner. Deferred neural rendering: Image synthesis using neural textures. ACM TOG, 2019.
[225] … video. In ICCV, 2021.
[227] H.-Y. Tseng, H.-Y. Lee, J.-B. Huang, and M.-H. Yang. Cross-domain few-shot classification via learned feature-wise transformation. In ICLR, 2020.
[228] O. Vinyals, C. Blundell, T. Lillicrap, D. Wierstra, et al. Matching networks for one shot learning. In NIPS, 2016.
[231] C. Vondrick, H. Pirsiavash, and A. Torralba. Generating videos with scene dynamics. In NIPS, 2016.
[232] R. Wan, B. Shi, L.-Y. Duan, A.-H. Tan, and A. C. Kot. Benchmarking single-image reflection removal algorithms. In ICCV, 2017.
[233] R. Wan, B. Shi, T. A. Hwee, and A. C. Kot. Depth of field guided reflection removal. In ICIP, 2016.
[234] F. Wang, Q. Huang, and L. J. Guibas. Image co-segmentation via consistent functional maps. In ICCV, 2013.
[235] M. Wang, G.-Y. Yang, J.-K. Lin, S.-H. Zhang, A. Shamir, S.-P. Lu, and S.-M. Hu. Deep online video stabilization with multi-grid warping transformation learning. TIP, 2018.
[236] Y.-S. Wang, F. Liu, P.-S. Hsu, and T.-Y. Lee. Spatially and temporally optimized video stabilization. IEEE Transactions on Visualization and Computer Graphics, 2013.
[237] Z. Wang, A. C. Bovik, H. R. Sheikh, E. P. Simoncelli, et al. Image quality assessment: from error visibility to structural similarity. TIP, 13(4):600–612, 2004.
[238] Z. Wang, S. Wu, W. Xie, M. Chen, and V. A. Prisacariu. Nerf–: Neural radiance fields without known camera parameters. arXiv preprint arXiv:2102.07064, 2021.
[239] K. Wei, J. Yang, Y. Fu, D. Wipf, and H. Huang. Single image reflection removal exploiting misaligned training data and network enhancements. In CVPR, 2019.
[240] P. Weinzaepfel, J. Revaud, Z. Harchaoui, and C. Schmid. DeepFlow: Large displacement optical flow with deep matching. In ICCV, 2013.
[241] C.-Y. Weng, B. Curless, P. P. Srinivasan, J. T. Barron, and I. Kemelmacher-Shlizerman. Humannerf: Free-viewpoint rendering of moving people from monocular video. In CVPR, 2022.
[242] M. Werlberger, T. Pock, M. Unger, and H. Bischof. Optical flow guided TV-L1 video interpolation and restoration. In IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2011.
[243] O. Wiles, G. Gkioxari, R. Szeliski, and J. Johnson. SynSin: End-to-end view synthesis from a single image. In CVPR, 2020.
[244] S. Wu, C. Rupprecht, and A. Vedaldi. Unsupervised learning of probably symmetric deformable 3D objects from images in the wild. In CVPR, 2020.
[245] S. Wu, J. Xu, Y.-W. Tai, and C.-K. Tang. Deep high dynamic range imaging with large foreground motions. In ECCV, 2018.
[246] W. Xian, J.-B. Huang, J. Kopf, and C. Kim. Space-time neural irradiance fields for free-viewpoint video. In CVPR, 2021.
[247] S. Xie and Z. Tu. Holistically-nested edge detection. In ICCV, 2015.
[248] L. Xu, J. Jia, and Y. Matsushita. Motion detail preserving optical flow estimation. TPAMI, 34(9):1744–1757, 2012.
[249] N. Xu, B. Price, S. Cohen, and T. Huang. Deep image matting. In CVPR, 2017.
[250] R. Xu, X. Li, B. Zhou, and C. C. Loy. Deep flow-guided video inpainting. In CVPR, 2019.
[251] S.-Z. Xu, J. Hu, M. Wang, T.-J. Mu, and S.-M. Hu. Deep video stabilization using adversarial networks. In Computer Graphics Forum, 2018.
[252] T. Xue, B. Chen, J. Wu, D. Wei, and W. T. Freeman. Video enhancement with task-oriented flow. IJCV, 127(8):1106–1125, 2019.
[253] T. Xue, M. Rubinstein, C. Liu, and W. T. Freeman. A computational approach for obstruction-free photography. ACM TOG, 34(4):79, 2015.
[254] T. Xue, J. Wu, K. Bouman, and B. Freeman. Visual dynamics: Probabilistic future frame synthesis via cross convolutional networks. In NIPS, 2016.
[255] J. Yang, D. Gong, L. Liu, and Q. Shi. Seeing deeply and bidirectionally: A deep learning approach for single image reflection removal. In ECCV, 2018.
[256] J. Yang, H. Li, Y. Dai, and R. T. Tan. Robust optical flow estimation of double-layer images under transparency or reflection. In CVPR, 2016.
[257] X. Yang, K. Xu, Y. Song, Q. Zhang, X. Wei, and L. Rynson. Image correction via deep reciprocating hdr transformation. In CVPR, 2018.
[258] R. Yi, J. Wang, and P. Tan. Automatic fence segmentation in videos of dynamic scenes. In CVPR, 2016.
[259] Z. Yin and J. Shi. Geonet: Unsupervised learning of dense depth, optical flow and camera pose. In CVPR, 2018.
[260] J. S. Yoon, K. Kim, O. Gallo, H. S. Park, and J. Kautz. Novel view synthesis of dynamic scenes with globally coherent depths from a monocular camera. In CVPR, 2020.
[261] J. Yu, Z. Lin, J. Yang, X. Shen, X. Lu, and T. S. Huang. Generative image inpainting with contextual attention. In CVPR, 2018.
[262] J. Yu, Z. Lin, J. Yang, X. Shen, X. Lu, and T. S. Huang. Free-form image inpainting with gated convolution. ICCV, 2019.
[263] J. Yu and R. Ramamoorthi. Selfie video stabilization. In ECCV, 2018.
[264] J. Yu and R. Ramamoorthi. Robust video stabilization by optimization in cnn weight space. In CVPR, 2019.
[265] J. Yu and R. Ramamoorthi. Learning video stabilization using optical flow. In CVPR, 2020.
[266] Z. Yu, H. Li, Z. Wang, Z. Hu, and C. W. Chen. Multi-level video frame interpolation: Exploiting the interaction among different levels. TCSVT, 2013.
[267] C. Zach, M. Klopschitz, and M. Pollefeys. Disambiguating visual relations using loop constraints. In CVPR, 2010.
[268] J. Zhang and J.-F. Lalonde. Learning high dynamic range from outdoor panoramas. In ICCV, 2017.
[269] K. Zhang, G. Riegler, N. Snavely, and V. Koltun. Nerf++: Analyzing and improving neural radiance fields. arXiv preprint arXiv:2010.07492, 2020.
[270] R. Zhang, P. Isola, A. A. Efros, E. Shechtman, and O. Wang. The unreasonable effectiveness of deep features as a perceptual metric. In CVPR, 2018.
[271] X. Zhang, R. Ng, and Q. Chen. Single image reflection separation with perceptual losses. In CVPR, 2018.
[272] W. Zhao, S. Liu, H. Guo, W. Wang, and Y.-J. Liu. Particlesfm: Exploiting dense point trajectories for localizing moving cameras in the wild. In ECCV, 2022.
[273] Y. Zhao, R. Wang, W. Jia, W. Zuo, X. Liu, and W. Gao. Deep reconstruction of least significant bits for bit-depth expansion. TIP, 2019.
[274] T. Zhou, M. Brown, N. Snavely, and D. G. Lowe. Unsupervised learning of depth and ego-motion from video. In CVPR, 2017.
[275] T. Zhou, Y. Jae Lee, S. X. Yu, and A. A. Efros. FlowWeb: Joint image set alignment by weaving consistent, pixel-wise correspondences. In CVPR, 2015.
[276] T. Zhou, P. Krahenbuhl, M. Aubry, Q. Huang, and A. A. Efros. Learning dense correspondence via 3D-guided cycle consistency. In CVPR, 2016.
[277] T. Zhou, P. Krahenbuhl, and A. A. Efros. Learning data-driven reflectance priors for intrinsic image decomposition. In ICCV, 2015.
[278] T. Zhou, R. Tucker, J. Flynn, G. Fyffe, and N. Snavely. Stereo magnification: Learning view synthesis using multiplane images. ACM TOG, 2018.
[279] Z. Zhou, H. Jin, and Y. Ma. Plane-based content preserving warps for video stabilization. In CVPR, 2013.
[280] J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In ICCV, 2017.
[281] L. Zintgraf, K. Shiarli, V. Kurin, K. Hofmann, and S. Whiteson. Fast context adaptation via meta-learning. In ICML, 2019.
[282] C. L. Zitnick, S. B. Kang, M. Uyttendaele, S. Winder, and R. Szeliski. High-quality video view interpolation using a layered representation. ACM TOG, 2004.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/83080
dc.description.abstract: 我們生活在一個豐富多彩、充滿活力和連續的世界中,人類視覺系統可以捕捉光線並將其傳遞給大腦進行進一步分析。十多年來,人們一直試圖使用數位相機來模仿人類的視覺系統並捕捉我們所看到的場景。然而,因為數位相機的不完美,充分獲取信息並反映人眼的感知具有挑戰性。因此,本論文的中心主題是彌合不完善的相機系統與人類視覺感知之間的差距。本論文的重點是圖像恢復和視頻增強,其中不完美的拍攝過程會破壞輸入圖像/視頻。我們嘗試使用基於深度學習的方法來解決以下任務:(1)單圖像 HDR 重建,(2)視頻幀插值,(3)多幀障礙消除,(4)視頻穩定,以及(5)空間-時間視圖合成。與將網絡視為黑盒並學習輸入-輸出映射函數的標準基於學習的方法不同,我們將物理約束或先驗項作為正則化來提高輸出質量和泛化能力。 (zh_TW)
dc.description.abstract: We live in a colorful, dynamic, and continuous world. The human visual system captures light and passes it to the brain for further analysis. For over a decade, people have tried to use digital cameras to mimic the human visual system and capture the scenes we see. However, because of the imperfections of digital cameras, it is challenging to fully acquire this information and reflect the perception of the human eye. Therefore, the central theme of this dissertation is to bridge the gap between imperfect camera systems and human visual perception. The dissertation focuses on image restoration and video enhancement, where an imperfect capturing process corrupts the input images and videos. We tackle the following tasks with learning-based approaches: (1) single-image HDR reconstruction, (2) video frame interpolation, (3) multi-frame obstruction removal, (4) video stabilization, and (5) space-time view synthesis. Unlike standard learning-based approaches that treat the network as a black box and learn an input-output mapping function, we integrate physical constraints or prior terms as regularization to improve output quality and generalization ability. (en)
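The abstract's closing point, folding a physical constraint or prior term into the training objective rather than learning a pure black-box mapping, can be illustrated with a small sketch. The snippet below is not the dissertation's implementation; it assumes a hypothetical restoration setting and adds a simple total-variation smoothness prior to an L1 data term, with NumPy arrays standing in for a real training framework.

import numpy as np

def total_variation(img):
    # Smoothness prior: sum of absolute differences between neighboring pixels.
    return np.abs(np.diff(img, axis=0)).sum() + np.abs(np.diff(img, axis=1)).sum()

def regularized_loss(prediction, target, weight=0.1):
    # Data-fidelity term (L1) plus a modeling-based prior term acting as regularization.
    data_term = np.abs(prediction - target).mean()
    prior_term = total_variation(prediction) / prediction.size
    return data_term + weight * prior_term

# Toy usage with random 64x64 "images" (placeholders for a network output and ground truth).
rng = np.random.default_rng(0)
pred, gt = rng.random((64, 64)), rng.random((64, 64))
print(regularized_loss(pred, gt))

In practice the prior would be task-specific (for example, a camera-pipeline model for HDR reconstruction or a motion-linearity constraint for frame interpolation, as the table of contents below indicates), but the structure of the objective is the same: a learned data term plus a modeling-based regularizer.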
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-01-06T17:05:23Z. No. of bitstreams: 0 (en)
dc.description.provenance: Made available in DSpace on 2023-01-06T17:05:23Z (GMT). No. of bitstreams: 0 (en)
dc.description.tableofcontents: Acknowledgements i
摘要 iii
Abstract v
Contents vii
List of Figures xi
List of Tables xix
Chapter 1 Introduction 1
Chapter 2 Single-image HDR Reconstruction 5
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.3 Learning to Reverse the Camera Pipeline . . . . . . . . . . . . . . . 9
2.3.1 LDR Image Formation . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3.2 Dequantization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3.3 Linearization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3.4 Hallucination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3.5 Joint Training . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.3.6 Refinement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.4.1 Experiment Setups . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.4.2 Comparisons with State-of-the-art Methods . . . . . . . . . . . . . 19
2.4.3 Ablation Studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Chapter 3 Video Frame Interpolation 23
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.3 Deep Video Frame Interpolation using Cyclic Frame Generation . . . 27
3.3.1 Cycle Consistency Loss Lc . . . . . . . . . . . . . . . . . . . . . . 28
3.3.2 Motion Linearity Loss Lm . . . . . . . . . . . . . . . . . . . . . . 30
3.3.3 Edge-guided Training E . . . . . . . . . . . . . . . . . . . . . . . . 31
3.3.4 Implementation Details . . . . . . . . . . . . . . . . . . . . . . . . 32
3.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.4.1 Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.4.2 Ablation Studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.4.3 Comparisons with State-of-the-Art Methods . . . . . . . . . . . . . 37
3.4.4 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Chapter 4 Multi-frame Obstruction Removal 41
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.3 Learning to See Through Obstructions with Layered Decomposition . 46
4.3.1 Algorithmic Overview . . . . . . . . . . . . . . . . . . . . . . . . 47
4.3.2 Initial Flow Decomposition . . . . . . . . . . . . . . . . . . . . . . 47
4.3.3 Background/Reflection Layer Reconstruction . . . . . . . . . . . . 48
4.3.4 Optical Flow Refinement . . . . . . . . . . . . . . . . . . . . . . . 50
4.3.5 Network training . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.3.6 Meta-learning for Fast Adaptation . . . . . . . . . . . . . . . . . . 52
4.3.7 Online Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.3.8 Extension to other obstruction removal tasks . . . . . . . . . . . . . 54
4.3.9 Synthetic Sequence Generation . . . . . . . . . . . . . . . . . . . . 55
4.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.4.1 Comparisons with State-of-the-arts . . . . . . . . . . . . . . . . . . 57
4.4.2 Analysis and Discussions . . . . . . . . . . . . . . . . . . . . . . . 64
4.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Chapter 5 Video Stabilization 75
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.3 Hybrid Neural Fusion for Full-frame Video Stabilization . . . . . . . 80
5.3.1 Pre-processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.3.2 Warping and Fusion . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.3.3 Path Adjustment . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
5.3.4 Training Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
5.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.4.1 Ablation Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.4.2 Quantitative Evaluation . . . . . . . . . . . . . . . . . . . . . . . . 87
5.4.3 Visual Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . 89
5.4.4 View Synthesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Chapter 6 Robust Dynamic Radiance Fields 93
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
6.3 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
6.3.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.3.2 Method Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
6.3.3 Camera Pose Estimation . . . . . . . . . . . . . . . . . . . . . . . 100
6.3.4 Dynamic Radiance Field Reconstruction . . . . . . . . . . . . . . . 103
6.3.5 Implementation Details . . . . . . . . . . . . . . . . . . . . . . . . 107
6.3.6 Evaluation on Camera Poses Estimation . . . . . . . . . . . . . . . 107
6.3.7 Evaluation on Dynamic View Synthesis . . . . . . . . . . . . . . . 108
6.3.8 Ablation Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
6.3.9 Failure Cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
6.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
References 111
dc.language.iso: en
dc.title: 通過結合建模和學習來恢復和增強圖像和視頻 (zh_TW)
dc.title: Restoring and Enhancing Images and Videos by Combining Modeling and Learning (en)
dc.title.alternative: Restoring and Enhancing Images and Videos by Combining Modeling and Learning
dc.type: Thesis
dc.date.schoolyear: 111-1
dc.description.degree: 博士 (doctoral)
dc.contributor.oralexamcommittee: 陳祝嵩;王鈺強;陳宏銘;楊明玄;黃嘉斌;林彥宇;廖弘源;賴尚宏 (zh_TW)
dc.contributor.oralexamcommittee: Chu-Song Chen;Yu-Chiang Frank Wang;Homer H. Chen;Ming-Hsuan Yang;Jia-Bin Huang;Yen-Yu Lin;Mark Liao;Shang-Hong Lai (en)
dc.subject.keyword: 圖像恢復,光流估計,多幀融合,計算攝影,卷積神經網絡,基於單圖像的 HDR 重建,視頻幀插值,反射去除,視頻穩定,新視角合成 (zh_TW)
dc.subject.keyword: Image restoration,Optical flow estimation,Multi-frame fusion,Computational photography,Convolutional neural networks,Single-image-based HDR reconstruction,Video frame interpolation,Reflection removal,Video stabilization,Novel view synthesis (en)
dc.relation.page: 132
dc.identifier.doi: 10.6342/NTU202210149
dc.rights.note: 同意授權(全球公開) (consent granted; worldwide open access)
dc.date.accepted: 2022-12-22
dc.contributor.author-college: 電機資訊學院
dc.contributor.author-dept: 資訊工程學系
Appears in Collections: 資訊工程學系

Files in This Item:
File | Size | Format
U0001-1116221219382014.pdf | 53.21 MB | Adobe PDF

