NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/87294
Full metadata record (each entry: DC field: value [language])
dc.contributor.advisor: 陳炳宇 [zh_TW]
dc.contributor.advisor: Bing-Yu Chen [en]
dc.contributor.author: 岳哲仰 [zh_TW]
dc.contributor.author: Jhe-Yang Yue [en]
dc.date.accessioned: 2023-05-18T16:53:39Z
dc.date.available: 2023-11-09
dc.date.copyright: 2023-05-14
dc.date.issued: 2022
dc.date.submitted: 2002-01-01
dc.identifier.citation:
A. Chen, Z. Xu, A. Geiger, J. Yu, and H. Su. TensoRF: Tensorial radiance fields, 2022.
S. Chen, Y. Li, and N. M. Kwok. Active vision in robotic systems: A survey of recent developments. The International Journal of Robotics Research, 30(11):1343–1377, 2011.
C. Connolly. The determination of next best views. In Proceedings of the 1985 IEEE International Conference on Robotics and Automation, volume 2, pages 432–435, 1985.
J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255, 2009.
K. Deng, A. Liu, J.-Y. Zhu, and D. Ramanan. Depth-supervised NeRF: Fewer views and faster training for free. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12882–12891, 2022.
S. A. Eslami, D. Jimenez Rezende, F. Besse, F. Viola, A. S. Morcos, M. Garnelo, A. Ruderman, A. A. Rusu, I. Danihelka, K. Gregor, et al. Neural scene representation and rendering. Science, 360(6394):1204–1210, 2018.
L. Hou, C.-P. Yu, and D. Samaras. Squared Earth Mover's distance-based loss for training deep neural networks. arXiv preprint arXiv:1611.05916, 2016.
S. Isler, R. Sabzevari, J. Delmerico, and D. Scaramuzza. An information gain formulation for active volumetric 3D reconstruction. In 2016 IEEE International Conference on Robotics and Automation (ICRA), pages 3477–3484. IEEE, 2016.
A. Jain, M. Tancik, and P. Abbeel. Putting NeRF on a diet: Semantically consistent few-shot view synthesis, 2021.
S. Khalfaoui, R. Seulin, Y. Fougerolle, and D. Fofi. An efficient method for fully automatic 3D digitization of unknown objects. Computers in Industry, 64(9):1152–1160, 2013.
S. Kriegel, C. Rink, T. Bodenmüller, and M. Suppa. Efficient next-best-scan planning for autonomous 3D surface reconstruction of unknown objects. Journal of Real-Time Image Processing, 10(4):611–631, 2015.
E. Levina and P. Bickel. The Earth Mover's distance is the Mallows distance: Some insights from statistics. In Proceedings of the Eighth IEEE International Conference on Computer Vision (ICCV 2001), volume 2, pages 251–256, 2001.
B. Mildenhall, P. P. Srinivasan, M. Tancik, J. T. Barron, R. Ramamoorthi, and R. Ng. NeRF: Representing scenes as neural radiance fields for view synthesis. In ECCV, 2020.
A. Mittal, A. K. Moorthy, and A. C. Bovik. No-reference image quality assessment in the spatial domain. IEEE Transactions on Image Processing, 21(12):4695–4708, 2012.
A. Mittal, R. Soundararajan, and A. C. Bovik. Making a "completely blind" image quality analyzer. IEEE Signal Processing Letters, 20(3):209–212, 2012.
T. Müller, A. Evans, C. Schied, and A. Keller. Instant neural graphics primitives with a multiresolution hash encoding. ACM Trans. Graph., 41(4):102:1–102:15, July 2022.
J. L. Schönberger and J.-M. Frahm. Structure-from-motion revisited. In Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
J. L. Schönberger, E. Zheng, M. Pollefeys, and J.-M. Frahm. Pixelwise view selection for unstructured multi-view stereo. In European Conference on Computer Vision (ECCV), 2016.
H. Sheikh, M. Sabir, and A. Bovik. A statistical evaluation of recent full reference image quality assessment algorithms. IEEE Transactions on Image Processing, 15(11):3440–3451, 2006.
K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
C. Sun, M. Sun, and H. Chen. Direct voxel grid optimization: Super-fast convergence for radiance fields reconstruction. In CVPR, 2022.
H. Talebi and P. Milanfar. NIMA: Neural image assessment. IEEE Transactions on Image Processing, 27(8):3998–4011, 2018.
A. Tewari, O. Fried, J. Thies, V. Sitzmann, S. Lombardi, K. Sunkavalli, R. Martin-Brualla, T. Simon, J. Saragih, M. Nießner, et al. State of the art on neural rendering. In Computer Graphics Forum, volume 39, pages 701–727. Wiley Online Library, 2020.
S. Wu, W. Sun, P. Long, H. Huang, D. Cohen-Or, M. Gong, O. Deussen, and B. Chen. Quality-driven Poisson-guided autoscanning. ACM Transactions on Graphics, 33(6), 2014.
K. Zhang, G. Riegler, N. Snavely, and V. Koltun. NeRF++: Analyzing and improving neural radiance fields. arXiv preprint arXiv:2010.07492, 2020.
R. Zhang, P. Isola, A. A. Efros, E. Shechtman, and O. Wang. The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 586–595, 2018.
W. Zhang, K. Ma, J. Yan, D. Deng, and Z. Wang. Blind image quality assessment using a deep bilinear convolutional neural network. IEEE Transactions on Circuits and Systems for Video Technology, 30(1):36–47, 2018.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/87294
dc.description.abstract: As a method that has earned impressive rendering results in neural volume computation, Neural Radiance Fields (NeRF) have developed rapidly in computer vision and graphics. However, photorealistic images come at the cost of more training data and enormous computation time, so recent research has focused on fast convergence to cut computation time and on additional constraints to get by with less training data. Few works, however, study how to spare NeRF the tedious capture and selection of images so as to progressively build a higher-quality result.
In this thesis, we present a way to sequentially pick the best images, offering a recommendation over the vast space of candidate viewpoints. To achieve this, we characterize the style of flawed images that NeRF produces, collect such images together with their corresponding quality scores, and, following the no-reference image quality assessment approach, train a model to rate the uncertainty of the images NeRF predicts. For a chosen scene, we build a camera capture-field template, on which this rating helps us sequentially find the views necessary to reconstruct the NeRF scene and thereby obtain better results. Through the views our method recommends, we show that it can give users a shooting direction and produce a better NeRF than unguided random capture. [zh_TW]
dc.description.abstract: As a compelling neural volumetric representation, Neural Radiance Fields (NeRF) have achieved significant results in learning to represent 3D scenes. However, producing photorealistic images demands lengthy training and large amounts of training data, so recent improvements have targeted training with fewer images or faster convergence to cut training cost. Moreover, capturing a real-world scene requires shooting a great many images from different directions.
To reduce this labor and to expose what a partially trained NeRF is still missing, we present a sequential scanning technique that reduces the number of images required: it suggests the next view to shoot based on the views already captured.
This goal is achieved by scanning at strategically selected Next-Best-Views (NBVs) to capture the object's geometric details progressively. The key to our method is an uncertainty analysis of the RGB images predicted by the NeRF model. Through this analysis we sequentially capture the missing views and obtain a better reconstruction. [en]
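The abstract outlines a capture loop that is easy to picture in code. The sketch below is a non-authoritative reading of that loop, assuming a fixed set of candidate camera poses (the capture field), a trained NeRF that can render any candidate pose, and an NRIQA model that maps a rendered image to a scalar quality score; the names Pose, render_view, and nriqa_score are hypothetical placeholders, not the thesis's actual implementation.

```python
# Minimal sketch of the next-best-view (NBV) loop described in the abstract.
# `render_view` and `nriqa_score` are hypothetical stand-ins for a trained
# NeRF renderer and a no-reference image quality (NRIQA) model; they are
# not the thesis's actual code.
from typing import Any, Callable, Iterable, List, Tuple

Pose = Tuple[float, float, float]  # e.g. (azimuth, elevation, radius) on the capture field


def next_best_view(
    candidate_poses: Iterable[Pose],
    captured: List[Pose],
    render_view: Callable[[Pose], Any],   # NeRF-predicted RGB image at a pose
    nriqa_score: Callable[[Any], float],  # quality score; higher means better
) -> Pose:
    """Return the uncaptured pose whose NeRF rendering scores worst,
    i.e. where the reconstruction is presumed most uncertain."""
    remaining = [p for p in candidate_poses if p not in captured]
    if not remaining:
        raise ValueError("every candidate view has already been captured")
    # A low NRIQA score serves as a proxy for high NeRF uncertainty.
    return min(remaining, key=lambda p: nriqa_score(render_view(p)))
```

In use, one would shoot the suggested view, refine the NeRF on the enlarged image set, and repeat until the worst remaining score clears a chosen quality threshold.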
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-05-18T16:53:39Z. No. of bitstreams: 0 [en]
dc.description.provenance: Made available in DSpace on 2023-05-18T16:53:39Z (GMT). No. of bitstreams: 0 [en]
dc.description.tableofcontents:
Certification of the Oral Defense Committee i
Acknowledgements ii
Chinese Abstract iii
Abstract iv
Contents vi
List of Figures viii
List of Tables xi
Chapter 1 Introduction 1
Chapter 2 Related Work 4
2.1 Novel View Synthesis . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.2 View Planning Problem . . . . . . . . . . . . . . . . . . . . . . . . 5
2.3 No Reference Image Quality Assessment . . . . . . . . . . . . . . . 5
Chapter 3 Preliminary 7
3.1 Preliminary Experiment for Quality Assessment . . . . . . . . . . . 7
3.2 Data Preparation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.3 CNN-based Model . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.3.1 NRIQA by Neural Image Assessment Model . . . . . . . . . . . . . 10
3.3.2 NRIQA by DBCNN Model . . . . . . . . . . . . . . . . . . . . . . 11
3.3.3 Result from CNN-based Model . . . . . . . . . . . . . . . . . . . . 11
Chapter 4 Method and Workflow 13
4.1 Construction of Capture Field . . . . . . . . . . . . . . . . . . . . . 14
4.1.1 Capture Field for Synthetic Environment . . . . . . . . . . . . . . . 14
4.1.2 Capture Field for Real-World Environment . . . . . . . . . . . . . . 16
4.2 Uncertainty Assessment . . . . . . . . . . . . . . . . . . . . . . . . 17
4.3 Next-Best-View Sampling . . . . . . . . . . . . . . . . . . . . . . . 19
Chapter 5 Experiments 20
5.1 Experiment Detail . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
5.2 Comparative Results . . . . . . . . . . . . . . . . . . . . . . . . . . 21
5.3 Capturing Nearby . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
5.4 Depth Metric Embedment . . . . . . . . . . . . . . . . . . . . . . . 28
5.4.1 Depth Metric Calculation . . . . . . . . . . . . . . . . . . . . . . . 28
5.4.2 Depth Metric Problem . . . . . . . . . . . . . . . . . . . . . . . . . 31
5.5 Quantitative Comparisons . . . . . . . . . . . . . . . . . . . . . . . 31
5.6 Real-world Capturing . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Chapter 6 Limitations and Future Work 40
Chapter 7 Conclusion 42
References 43
dc.language.iso: zh_TW
dc.subject: No-Reference Image Quality Assessment [zh_TW]
dc.subject: Neural Radiance Fields [zh_TW]
dc.subject: Next-Best-View [zh_TW]
dc.subject: Neural Radiance Field [en]
dc.subject: No-Reference Image Quality Assessment [en]
dc.subject: Next-Best-View [en]
dc.title: Active Neural Radiance Field Capturing by Selecting Uncertain NeRF Regions via No-Reference Image Quality Assessment [zh_TW]
dc.title: Active Neural Radiance Field Capturing with No-Reference Image Quality Assessment [en]
dc.type: Thesis
dc.date.schoolyear: 111-1
dc.description.degree: Master's
dc.contributor.oralexamcommittee: 詹力韋; 程芙茵; 沈奕超 [zh_TW]
dc.contributor.oralexamcommittee: Li-Wei Chan; Fu-Yin Cherng; I-Chao Shen [en]
dc.subject.keyword: Neural Radiance Fields, No-Reference Image Quality Assessment, Next-Best-View [zh_TW]
dc.subject.keyword: Neural Radiance Field, No-Reference Image Quality Assessment, Next-Best-View [en]
dc.relation.page: 46
dc.identifier.doi: 10.6342/NTU202204263
dc.rights.note: Authorization granted (open access worldwide)
dc.date.accepted: 2022-10-11
dc.contributor.author-college: College of Electrical Engineering and Computer Science
dc.contributor.author-dept: Department of Computer Science and Information Engineering
dc.date.embargo-lift: 2023-10-07
Appears in collections: Department of Computer Science and Information Engineering

Files in this item:
ntu-111-1.pdf (32.07 MB, Adobe PDF)


Items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
