Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/84867

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 洪一平 (Yi-Ping Hung) | |
| dc.contributor.author | Chien-Cheng Chen | en |
| dc.contributor.author | 陳建丞 | zh_TW |
| dc.date.accessioned | 2023-03-19T22:29:52Z | - |
| dc.date.copyright | 2022-09-12 | |
| dc.date.issued | 2022 | |
| dc.date.submitted | 2022-08-29 | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/84867 | - |
| dc.description.abstract | Video stabilization is a technique that reduces unwanted shake in a video while preserving its intended motion trajectory. It is particularly important for handheld capture devices, which are especially prone to shake during recording. Many previous stabilization methods compensate for the image information lost after warping by cropping or zooming, so the stabilized video has a smaller field of view than the original; for severely shaky videos, achieving a stable result often requires over-cropping that degrades viewing quality. We propose a deep-learning video stabilization method based on 3D depth information and image fusion that produces full-frame, high-quality results. We first optimize two convolutional neural networks on the input video to estimate scene depth and the camera's 3D motion trajectory. We then smooth the reconstructed camera trajectory and compute the transformation between the original and smoothed trajectories. To fill in the image information lost near the boundaries after warping, we warp multiple neighboring frames to the key frame and fuse them, performing the fusion twice: once at the feature level and once at the image level. Experimental results show that, compared with previous methods, the proposed method produces full-frame stabilization results with low distortion. | zh_TW |
| dc.description.abstract | Video stabilization is a technique used to reduce the undesired shakes and jitters of videos. Existing methods often require cropping due to the loss of pixels around the image boundaries after warping, leading to a smaller field of view. For severely shaky videos, the over-cropping needed to stabilize them is even more pronounced. We propose 3D Full-frame Stabilizer (3DFS), a 3D depth-based deep learning method using hybrid fusion for full-frame video stabilization. We first optimize two CNNs directly on the input video to predict scene depths and camera poses. We then smooth the reconstructed camera trajectory and estimate the warping field between the original and smoothed trajectories. To render the pixels lost around the image boundaries after warping, we align multiple neighboring frames to the target frame in both feature space and image space. Experiments show that the proposed method generates low-distortion full-frame stabilization results compared to previous methods. | en |
| dc.description.provenance | Made available in DSpace on 2023-03-19T22:29:52Z (GMT). No. of bitstreams: 1 U0001-2608202220451500.pdf: 2211086 bytes, checksum: 89dffa5489c4332d516028843e19c078 (MD5) Previous issue date: 2022 | en |
| dc.description.tableofcontents | Abstract (Chinese); Abstract (English); Contents; List of Figures; List of Tables; Chapter 1 Introduction; Chapter 2 Related Work: 2.1 Traditional 2D-Based Video Stabilization Methods, 2.2 Traditional 3D-Based Video Stabilization Methods, 2.3 Learning-Based Video Stabilization Methods, 2.4 Full-Frame Video Stabilization Methods; Chapter 3 Proposed Method: 3.1 Motion Estimation and Smoothing (3.1.1 Loss Functions, 3.1.2 Trajectory Smoothing), 3.2 Warping and Fusion (3.2.1 Warping, 3.2.2 Fusion, 3.2.3 Fusion Function, 3.2.4 Loss Function, 3.2.5 Training Dataset), 3.3 Implementation Details; Chapter 4 Experimental Results: 4.1 Qualitative Comparisons, 4.2 Quantitative Evaluations (4.2.1 Quantitative Comparisons, 4.2.2 Ablation Studies); Chapter 5 Conclusions; References | |
| dc.language.iso | zh-TW | |
| dc.subject | depth estimation | zh_TW |
| dc.subject | deep learning | zh_TW |
| dc.subject | view synthesis | zh_TW |
| dc.subject | camera pose estimation | zh_TW |
| dc.subject | video stabilization | zh_TW |
| dc.subject | video stabilization | en |
| dc.subject | deep learning | en |
| dc.subject | view synthesis | en |
| dc.subject | camera pose estimation | en |
| dc.subject | depth estimation | en |
| dc.title | Depth-Based Full-Frame Video Stabilization Using Hybrid Neural Fusion | zh_TW |
| dc.title | Depth-Based Full-Frame Video Stabilization Using Hybrid Neural Fusion | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 110-2 | |
| dc.description.degree | Master's | |
| dc.contributor.oralexamcommittee | 莊永裕 (Yung-Yu Chuang), 王鈺強 (Yu-Chiang Wang) | |
| dc.subject.keyword | video stabilization, depth estimation, camera pose estimation, view synthesis, deep learning | zh_TW |
| dc.subject.keyword | video stabilization, depth estimation, camera pose estimation, view synthesis, deep learning | en |
| dc.relation.page | 36 | |
| dc.identifier.doi | 10.6342/NTU202202875 | |
| dc.rights.note | Access authorized (restricted to campus) | |
| dc.date.accepted | 2022-08-29 | |
| dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW |
| dc.contributor.author-dept | Graduate Institute of Networking and Multimedia | zh_TW |
| dc.date.embargo-lift | 2022-09-12 | - |
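
The abstracts above center on one concrete step: smoothing the reconstructed camera trajectory and deriving a per-frame correction between the original and smoothed paths. Below is a minimal sketch of that step, assuming a simple 6-DoF pose array and a Gaussian low-pass filter; the pose parameterization, the filter choice, and `sigma` are illustrative assumptions, not the thesis's implementation.

```python
# Minimal sketch only: a plausible reading of the trajectory-smoothing step
# described in the abstract, not the thesis implementation. The 6-DoF pose
# parameterization, the Gaussian filter, and sigma are all assumptions.
import numpy as np
from scipy.ndimage import gaussian_filter1d

def smooth_trajectory(poses: np.ndarray, sigma: float = 5.0) -> np.ndarray:
    """Low-pass filter a camera trajectory along the time axis.

    poses: (N, 6) array of per-frame parameters, assumed here to be
    3 translations + 3 axis-angle rotations.
    """
    # Filter each pose dimension independently over time.
    return gaussian_filter1d(poses, sigma=sigma, axis=0)

def pose_corrections(poses: np.ndarray, sigma: float = 5.0) -> np.ndarray:
    """Per-frame residual carrying each original pose onto the smoothed
    trajectory; the full method would combine such a correction with the
    CNN-predicted scene depths to build a dense warping field instead."""
    return smooth_trajectory(poses, sigma) - poses

# Usage: 120 frames of jittery forward motion.
rng = np.random.default_rng(0)
poses = np.cumsum(rng.normal(0.0, 0.02, size=(120, 6)), axis=0)
poses[:, 2] += np.linspace(0.0, 1.0, 120)  # steady forward translation in z
print(pose_corrections(poses).shape)       # -> (120, 6)
```

In the full method the abstracts describe, this pose-space correction would not be applied directly; it would be combined with the predicted per-pixel depths to estimate the warping field between the original and smoothed trajectories.
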
| Appears in Collections: | Graduate Institute of Networking and Multimedia |
Files in This Item:
| File | Size | Format |
|---|---|---|
| U0001-2608202220451500.pdf (Restricted to NTU campus IP addresses; off-campus users please use the VPN service) | 2.16 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
