Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96073

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 簡韶逸 | zh_TW |
| dc.contributor.advisor | Shao-Yi Chien | en |
| dc.contributor.author | 潘世軒 | zh_TW |
| dc.contributor.author | Shih-Hsuan Pan | en |
| dc.date.accessioned | 2024-10-11T16:07:12Z | - |
| dc.date.available | 2024-10-12 | - |
| dc.date.copyright | 2024-10-11 | - |
| dc.date.issued | 2024 | - |
| dc.date.submitted | 2024-09-27 | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96073 | - |
| dc.description.abstract | 影片超解析度(VSR)技術能提升影片的解析度,為使用者提供更清晰的視覺體驗。然而,VSR 通常會因為較長的延遲,造成難以實時運行。此限制讓即時影片增強難以達成。注視點串流利用了人類視覺系統(HVS)的特性,視覺敏銳度在注視點周圍較高,因此在注視點區域傳輸高解析度的內容;外圍區域對使用者的視覺影響較小,則使用較低解析度。此技術可以有效減少串流的傳輸頻寬和延遲並維持相同的視覺體驗。在本論文中,我們提出了一個結合注視點串流概念的影片超解析度系統。透過眼動追蹤器獲取使用者的注視點,我們能將計算資源集中在注視區域,以實現實時的影片超解析度。我們也提出了一個可變形對齊與光流傳播模組,以有效對齊相鄰幀之間的特徵。此外,我們使用平移對齊來解決由於注視窗口位移所導致的錯位問題。我們的注視點影片超解析度系統大幅降低了超高畫質(UHD)影片超解析度的計算量,並將延遲減少約 90%,更保證了同等甚至更好的視覺體驗。此技術未來有望應用於配備眼動追蹤器的顯示設備中,為使用者提供超低延遲的解析度增強功能。 | zh_TW |
| dc.description.abstract | Video super-resolution (VSR) technology enhances the resolution of videos, providing users with a clearer visual experience. However, VSR often suffers from long latency and cannot easily run online, making real-time video enhancement impractical. Foveated streaming leverages the characteristics of the human visual system (HVS): visual acuity is highest around the gaze point, so high-resolution content is streamed for the foveal region while a lower resolution is used for the peripheral areas, which contribute less to the user's visual perception. This approach reduces the transmission bandwidth and latency of streaming. In this thesis, we introduce a foveated video super-resolution system that builds on the concept of foveated streaming. Using an eye tracker to obtain the user's gaze position, we concentrate computational resources on the focal area to achieve real-time online video super-resolution. We also propose a deformable alignment and flow propagation module to align features between adjacent frames efficiently. Additionally, shift alignment is employed to address the misalignment caused by the movement of the foveated window. Our foveated video super-resolution system significantly reduces the computational load of ultra-high-definition (UHD) video super-resolution, cutting latency by approximately 90% while ensuring the same or an even better visual experience for the user. This technology has the potential to be applied in display devices equipped with eye trackers, providing users with ultra-low-latency resolution enhancement. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-10-11T16:07:12Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2024-10-11T16:07:12Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | Master’s Thesis Acceptance Certificate i<br>Acknowledgement iii<br>Chinese Abstract v<br>Abstract vii<br>Contents ix<br>List of Figures xi<br>List of Tables xiii<br>1 Introduction 1<br>1.1 Video Super-Resolution 1<br>1.2 Foveated Streaming 2<br>1.3 Challenges 4<br>1.4 Contribution 6<br>1.5 Thesis Organization 7<br>2 Related Work 9<br>2.1 Video Super-Resolution 9<br>2.1.1 Recurrent Networks 9<br>2.1.2 Deformable Alignment 11<br>2.1.3 Temporal Motion Propagation 14<br>2.2 Video Super-Resolution with Foveated Rendering 14<br>3 Proposed Method 17<br>3.1 Overview of Foveated Video Super-Resolution System 17<br>3.2 Shift Alignment 19<br>3.3 Deformable Alignment and Flow Propagation 21<br>3.4 Restart Mechanism 29<br>3.5 Two Stage Training 30<br>3.6 System Implementation 32<br>4 Experiments 35<br>4.1 Datasets 35<br>4.2 Training and Implementation Details 36<br>4.3 Performance Evaluation 38<br>4.3.1 Performance Evaluation of General VSR 38<br>4.3.2 Eye-weighted Metric 39<br>4.3.3 Performance Evaluation of Foveated VSR 40<br>4.4 Effectiveness of Deformable Alignment and Flow Propagation Module 42<br>4.5 Ablation Study 45<br>5 Conclusion 49<br>Reference 51 | - |
| dc.language.iso | en | - |
| dc.title | 基於可變形對齊和光流傳播之實時注視點影片超解析度系統 | zh_TW |
| dc.title | Real-Time Foveated Video Super-Resolution System with Deformable Alignment and Flow Propagation | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 113-1 | - |
| dc.description.degree | Master's | - |
| dc.contributor.oralexamcommittee | 施吉昇;莊永裕;陳冠文 | zh_TW |
| dc.contributor.oralexamcommittee | Chi-Sheng Shih;Yung-Yu Chuang;Kuan-Wen Chen | en |
| dc.subject.keyword | 超解析度, 影片, 注視點渲染 | zh_TW |
| dc.subject.keyword | Super-Resolution, Video, Foveated Rendering | en |
| dc.relation.page | 57 | - |
| dc.identifier.doi | 10.6342/NTU202404417 | - |
| dc.rights.note | Authorized for release (campus-only access) | - |
| dc.date.accepted | 2024-09-27 | - |
| dc.contributor.author-college | College of Electrical Engineering and Computer Science | - |
| dc.contributor.author-dept | Graduate Institute of Electronics Engineering | - |
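The abstract above describes cropping a window around the user's gaze, super-resolving only that crop, and using shift alignment to compensate for the foveated window's movement between frames. The windowing idea can be sketched minimally as follows (the function names, the 256-pixel window size, and the `np.roll`-based shift are illustrative assumptions, not the thesis's actual implementation):

```python
import numpy as np

def crop_fovea(frame, gaze, size):
    """Clamp a size x size window around the gaze point inside the frame."""
    h, w = frame.shape[:2]
    y = min(max(gaze[0] - size // 2, 0), h - size)
    x = min(max(gaze[1] - size // 2, 0), w - size)
    return frame[y:y + size, x:x + size], (y, x)

def shift_align(prev_feat, prev_origin, cur_origin):
    """Shift the previous window's features so pixels line up with the
    current window. np.roll wraps around at the borders; a real system
    would treat the wrapped border region as invalid (cf. the thesis's
    restart mechanism) rather than reuse it."""
    dy = prev_origin[0] - cur_origin[0]
    dx = prev_origin[1] - cur_origin[1]
    return np.roll(np.roll(prev_feat, dy, axis=0), dx, axis=1)

# A 1080p frame with the gaze at the center: only the 256x256 foveal crop
# would be fed to the super-resolution network, not the full frame.
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
patch, origin = crop_fovea(frame, gaze=(540, 960), size=256)
# patch.shape == (256, 256, 3), origin == (412, 832)
```

Cropping before super-resolution is what makes the roughly 90% latency saving plausible: the network processes a fixed small window regardless of the UHD output resolution.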
Appears in Collections: Graduate Institute of Electronics Engineering
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-113-1.pdf (access restricted to NTU campus IPs; use the VPN service for off-campus access) | 21.93 MB | Adobe PDF |
Items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
