Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21472
Full metadata record
dc.contributor.advisor: 丁建均
dc.contributor.author: Yi-Wen Chen (en)
dc.contributor.author: 陳意雯 (zh_TW)
dc.date.accessioned: 2021-06-08T03:35:04Z
dc.date.copyright: 2019-08-06
dc.date.issued: 2019
dc.date.submitted: 2019-07-31
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/21472
dc.description.abstract (zh_TW): Nowadays, digital images are easy to obtain, and high-resolution images are often needed for subsequent image processing and analysis. However, owing to the limitations of imaging sensors and optics, the spatial resolution of images captured by digital cameras is limited. Upgrading the hardware to obtain high-resolution images is too costly, whereas image super-resolution offers a convenient and economical solution.
Image super-resolution aims to generate a high-resolution image from a low-resolution one. It is a fundamental problem in image processing and is widely used in high-level computer vision applications such as surveillance, medical diagnosis, and remote sensing. Because one low-resolution image can correspond to many different high-resolution images, super-resolution is an ill-posed problem with no unique, stable solution. In this thesis, we propose two super-resolution methods: one combines the strengths of several existing super-resolution methods, and the other is based on deep learning.
Conventional super-resolution methods such as bilinear and cubic convolution interpolation are simple and fast, but they produce artifacts such as blurring and ringing. To address these problems, we propose a model that combines the advantages of different methods. We analyze three super-resolution methods and find that each suits image regions with different characteristics; we therefore extract multiple features from the image and, based on statistics of these feature values and the error of each method's result, compute a weighted average of the methods' outputs for each input image.
With the recent development of convolutional neural networks and deep learning, models trained on large amounts of data achieve excellent results in many computer vision applications. In this thesis, we also propose a deep-learning-based model. Since an image exhibits different characteristics in different frequency bands, we first decompose it into four sub-bands using the wavelet transform and feed them into four separate models, so that each model learns the features of a specific band and training becomes more effective. We also adopt dense connections so that features from different layers of the network can be used more effectively. In the testing stage, geometric self-ensemble is applied to boost the model's performance.
dc.description.abstract (en): Nowadays, digital images are easy to access, and high-resolution images are often required for subsequent image processing and analysis. However, the spatial resolution of images captured by digital cameras is limited by the principles of optics and the size of the imaging sensor. Since building optical components that can capture very high-resolution images is prohibitively expensive and impractical, image super-resolution (SR) provides a convenient and economical alternative.
Image super-resolution aims to generate a high-resolution (HR) image from a low-resolution (LR) input image. It is an essential task in image processing and is used in many high-level computer vision applications, such as video surveillance, medical diagnosis, and remote sensing. Super-resolution is an ill-posed problem, since multiple HR images can correspond to the same LR image. In this thesis, we propose two algorithms for image super-resolution: the first combines and takes advantage of different image super-resolution methods, while the second is based on deep learning.
Conventional image super-resolution methods, including bilinear interpolation and cubic convolution interpolation, are intuitive and simple to use. However, they often suffer from artifacts such as blurring and ringing. To address this problem, we propose a weighting-based algorithm that takes advantage of three different image super-resolution methods and generates the final result by combining them. We extract features from the input LR image and investigate how each candidate method performs under different feature values. The candidate methods' results are then combined using a weighted average whose weights are derived from statistics of the training data.
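The per-pixel weighted blend described above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the function name and toy inputs are hypothetical, and in the actual method the weights would come from training-set error statistics for the extracted image features rather than being supplied by hand.

```python
import numpy as np

def combine_sr_results(results, weights):
    """Blend candidate super-resolution outputs with a weighted average.

    results: list of same-shape HR estimates (H x W arrays), one per method.
    weights: per-method weights (assumed here to be given; in the thesis
             they would be derived from training statistics).
    """
    w = np.asarray(weights, dtype=np.float64)
    w = w / w.sum()                        # normalize so weights sum to 1
    stack = np.stack(results, axis=0)      # shape: (n_methods, H, W)
    return np.tensordot(w, stack, axes=1)  # per-pixel weighted average

# Toy example: blend three constant 2x2 "upscaled" images.
a = np.full((2, 2), 10.0)
b = np.full((2, 2), 20.0)
c = np.full((2, 2), 40.0)
out = combine_sr_results([a, b, c], weights=[0.5, 0.3, 0.2])  # every pixel 19.0
```

Normalizing the weights inside the function keeps the blend a convex combination, so the output stays within the dynamic range of the candidate images.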
With the development of convolutional neural networks and deep learning in recent years, models trained on large-scale datasets have achieved favorable performance in many computer vision applications. In this thesis, we propose a second, deep learning-based approach to image super-resolution. We use the wavelet transform to separate the input image into four frequency bands and train one model for each sub-band. By processing the information in different frequency bands with different CNN models, we can extract features more efficiently and learn better LR-to-HR mappings. In addition, we add dense connections to make better use of the internal features of the CNN. Finally, geometric self-ensemble is applied in the testing stage to maximize performance.
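A minimal sketch of the single-level 2D Haar wavelet decomposition that splits an image into the four sub-bands fed to the per-band models. The function name, the orthonormal 1/2 scaling, and the sub-band labels are illustrative assumptions, not the thesis's exact implementation.

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2D Haar wavelet transform.

    Splits an image with even height and width into four half-size
    sub-bands: LL (approximation) plus LH, HL, HH (details). Each
    sub-band would then be processed by its own CNN model.
    """
    a = img[0::2, 0::2]          # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]          # top-right
    c = img[1::2, 0::2]          # bottom-left
    d = img[1::2, 1::2]          # bottom-right
    ll = (a + b + c + d) / 2.0   # low-pass approximation
    lh = (a - b + c - d) / 2.0   # detail along one orientation
    hl = (a + b - c - d) / 2.0   # detail along the other orientation
    hh = (a - b - c + d) / 2.0   # diagonal detail
    return ll, lh, hl, hh

img = np.arange(16, dtype=np.float64).reshape(4, 4)
ll, lh, hl, hh = haar_dwt2(img)  # four 2x2 sub-bands
```

Because the transform is invertible, an HR prediction for each sub-band can be recombined into a full-resolution output with the corresponding inverse transform.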
dc.description.provenance: Made available in DSpace on 2021-06-08T03:35:04Z (GMT). No. of bitstreams: 1. ntu-108-R06942046-1.pdf: 2035521 bytes, checksum: b74025c57a894f313cf93d9655b116cf (MD5). Previous issue date: 2019 (en)
dc.description.tableofcontents:
Thesis Committee Certification #
Acknowledgements i
Chinese Abstract ii
ABSTRACT iii
CONTENTS v
LIST OF FIGURES viii
LIST OF TABLES ix
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Primary Contributions 2
Chapter 2 Related Work of Image Super-Resolution 4
2.1 Interpolation-Based Methods 4
2.1.1 Bilinear Interpolation 4
2.1.2 Bicubic Interpolation 5
2.1.3 Spline Interpolation 7
2.1.4 New Edge-Directed Interpolation 7
2.1.5 Directional Cubic Convolution Interpolation 8
2.2 Dictionary-Based Methods 9
2.2.1 Neighbor Embedding 9
2.2.2 Sparse Coding 10
2.2.3 Simple Functions 11
2.2.4 Anchored Neighborhood Regression 12
2.2.5 Adjusted Anchored Neighborhood Regression 13
2.2.6 Random Forests 14
2.3 Convolutional Neural Network-Based Methods 16
2.3.1 Pre-defined Upsampling 17
2.3.2 Single Upsampling 17
2.3.3 Progressive Upsampling 18
2.3.4 Iterative Up and Downsampling 18
2.4 Comparison of Methods 19
Chapter 3 Proposed Weighting-Based Approach 22
3.1.1 Candidate Methods 23
3.1.2 B-Spline Interpolation 23
3.1.3 Kittler-Illingworth Minimum Error Thresholding Directional Cubic Convolution 24
3.1.4 Structural Similarity-Based Edge Directed Image Interpolation 26
3.2 Automatic Mode Selection 28
3.2.1 Sobel Features 29
3.2.2 Laplacian-of-Gaussian Features 29
3.2.3 Gabor Wavelet Features 29
3.2.4 Mode Selection for Image Super-Resolution 29
Chapter 4 Proposed Learning-Based Approach 32
4.1 Network Architecture 33
4.2 2D Discrete Wavelet Transform 34
4.3 Dense Connection 36
4.4 Model Training 37
4.4.1 Residual-Learning 37
4.4.2 Adjustable Gradient Clipping 38
4.4.3 Multi-Scale 39
4.5 Geometric Self-Ensemble Augmentation 39
Chapter 5 Experimental Results 41
5.1 Results of Weighting-Based Approach 41
5.2 Results of Learning-Based Approach 45
5.2.1 Datasets and Metrics 45
5.2.2 Implementation Details 45
5.2.3 Comparison with State-of-the-Art Methods 46
Chapter 6 Conclusion and Future Work 50
6.1 Conclusion 50
6.2 Future Work 50
REFERENCE 52
dc.language.iso: en
dc.title: 基於加權算法與機器學習之影像超解析技術 (zh_TW)
dc.title: Super-Resolution Based on Advanced Weighting and Learning Techniques (en)
dc.type: Thesis
dc.date.schoolyear: 107-2
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 葉敏宏, 王鵬華, 余執彰
dc.subject.keyword: 影像超解析, 影像內插, 卷積神經網路, 小波轉換 (zh_TW)
dc.subject.keyword: super-resolution, interpolation, convolutional neural network, wavelet transform (en)
dc.relation.page: 56
dc.identifier.doi: 10.6342/NTU201902088
dc.rights.note: 未授權 (not authorized)
dc.date.accepted: 2019-08-01
dc.contributor.author-college: 電機資訊學院 (zh_TW)
dc.contributor.author-dept: 電信工程學研究所 (zh_TW)
Appears in collections: 電信工程學研究所 (Graduate Institute of Communication Engineering)

Files in this item:
ntu-108-1.pdf (1.99 MB, Adobe PDF): not authorized for public access