NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/87285

Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 莊永裕 | zh_TW
dc.contributor.advisor | Yung-Yu Chuang | en
dc.contributor.author | 鄭嘉賡 | zh_TW
dc.contributor.author | JIAGENG ZHENG | en
dc.date.accessioned | 2023-05-18T16:50:14Z | -
dc.date.available | 2023-11-09 | -
dc.date.copyright | 2023-05-11 | -
dc.date.issued | 2023 | -
dc.date.submitted | 2023-02-17 | -
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/87285 | -
dc.description.abstract | 隨著各種深度攝影機與三維虛擬實境顯示裝置的普及,帶有深度資訊的三維影像也越來越廣泛地被使用。但是這種新型資料格式尚未有統一的傳輸與存儲格式,並且在傳統設備上也存在相容性問題。
因此本論文提出了一種全新的方法,將三維圖片的深度資訊嵌入在二維通道內,並可解碼復原。本文方法中的神經網絡骨幹,為卷積神經網路與離散小波轉換之結合。
相較於傳統單獨使用卷積神經網路的編碼器與解碼器網路,本方法有佔用記憶體空間小、運算速度快、編解碼品質佳等優點。除此之外,針對可逆圖像轉換問題端到端訓練中遇到的量化誤差,本文提出了一種全新的可求導的量化函數,使得神經網路在測試時更少地受到量化誤差的影響。 | zh_TW
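The abstract describes a backbone that combines a convolutional neural network with a discrete wavelet transform module. As background, a single-level 2-D Haar transform and its exact inverse can be sketched as below. This is a generic illustration under my own naming (`haar_dwt2`, `haar_idwt2`), not the thesis's implementation:

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2-D Haar wavelet transform.

    Splits an array of shape (H, W), with even H and W, into four
    half-resolution sub-bands: approximation (LL) and horizontal,
    vertical, and diagonal details (LH, HL, HH).
    """
    a = img[0::2, 0::2].astype(float)  # top-left of each 2x2 block
    b = img[0::2, 1::2].astype(float)  # top-right
    c = img[1::2, 0::2].astype(float)  # bottom-left
    d = img[1::2, 1::2].astype(float)  # bottom-right
    ll = (a + b + c + d) / 2.0
    lh = (a - b + c - d) / 2.0
    hl = (a + b - c - d) / 2.0
    hh = (a - b - c + d) / 2.0
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2; reconstruction is exact."""
    h, w = ll.shape
    img = np.empty((2 * h, 2 * w))
    img[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    img[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    img[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    img[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return img
```

Because the transform is orthogonal and exactly invertible, it can be placed outside the learned network as a lossless down/upsampling step, which is the motivation an "external wavelet transform module" design typically has.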
dc.description.abstract | With the popularization of depth cameras and 3D virtual-reality displays, 3D images with depth information are increasingly widely used. However, there is no unified format for transmitting and storing this new type of data, and it raises compatibility issues on traditional devices.
Therefore, this thesis proposes a novel method that embeds the depth information of a 3D image into its RGB channels, from which it can later be restored. The backbone of the neural network is a combination of a convolutional neural network and a discrete wavelet transform module.
Compared with traditional encoder and decoder networks that use convolutional neural networks alone, our method requires less memory, computes faster, and yields better visual quality for both encoding and decoding. Additionally, to address the quantization error that arises in end-to-end training of reversible image conversion, we propose a differentiable quantization function, so that the network is less affected by quantization error at test time. | en
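The English abstract mentions a differentiable quantization function but the record does not give its form. One well-known differentiable surrogate for rounding, shown below, is a hypothetical stand-in for illustration only, not necessarily the function proposed in the thesis; `soft_round` and `period` are my own names:

```python
import numpy as np

def soft_round(x, period=1.0):
    """Smooth surrogate for rounding x to the nearest multiple of `period`.

    soft_round(x) = x - sin(2*pi*x/period) / (2*pi/period).
    Its derivative, 1 - cos(2*pi*x/period), is non-negative everywhere,
    so the map is monotone; exact multiples of `period` are fixed points,
    and other values are pulled toward the nearest quantization level.
    """
    w = 2.0 * np.pi / period
    return x - np.sin(w * x) / w
```

During training, such a surrogate (or a straight-through estimator) lets gradients flow through the quantization step; at test time hard rounding is applied, and the smaller the training/test mismatch, the less the quantization error degrades the decoded output.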
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-05-18T16:50:14Z. No. of bitstreams: 0 | en
dc.description.provenance | Made available in DSpace on 2023-05-18T16:50:14Z (GMT). No. of bitstreams: 0 | en
dc.description.tableofcontents | Oral Examination Committee Certification i
Acknowledgements ii
Chinese Abstract iii
Abstract iv
Contents vi
List of Figures viii
List of Tables ix
Chapter 1 Introduction 1
Chapter 2 Related Work 3
2.1 Reversible Work 3
2.1.1 Watermarking 3
2.1.2 Autoencoder 4
2.1.3 CycleGAN 4
2.1.4 Invertible Neural Networks 5
2.2 Depth Estimation 5
2.3 Quantization 5
Chapter 3 Proposed Method 7
3.1 Network Architecture 7
3.1.1 Encoder 8
3.1.2 Decoder 8
3.1.3 Haar Wavelet Transform 9
3.1.4 Backbone 9
3.1.5 Quantization Layer 10
3.2 Why an External Wavelet Transform Module? 11
3.2.1 Stability (a Necessary Condition for Learnability) 11
3.2.2 Limited kernel size of CNN for infinite domain 11
3.2.3 Example 11
3.3 Loss Function 13
3.3.1 Pixel Loss 13
3.3.2 Zero-Mean Noise Loss 13
3.3.3 Structure Loss 13
3.3.4 Combined Loss 14
Chapter 4 Experimental Result 15
4.1 Dataset 15
4.2 Comparison to the Baseline Models 15
4.3 Performance 20
4.4 Ablation Study 20
Chapter 5 Conclusion 22
Bibliography 23
-
dc.language.iso | en | -
dc.subject | 小波轉換 | zh_TW
dc.subject | 深度影像 | zh_TW
dc.subject | 可逆網路 | zh_TW
dc.subject | 卷積網路 | zh_TW
dc.subject | Depth Image | en
dc.subject | CNN | en
dc.subject | Wavelet Transform | en
dc.subject | Reversible Network | en
dc.title | 使用小波轉換卷積神經網絡進行RGB-D影像深度訊息的可逆嵌入 | zh_TW
dc.title | Reversible Depth Embedding for RGB-D Images via Wavelet-CNN Network | en
dc.type | Thesis | -
dc.date.schoolyear | 111-1 | -
dc.description.degree | 碩士 | -
dc.contributor.oralexamcommittee | 吳賦哲;葉正聖 | zh_TW
dc.contributor.oralexamcommittee | Fu-Che Wu;Jeng-Sheng Yeh | en
dc.subject.keyword | 深度影像,可逆網路,小波轉換,卷積網路 | zh_TW
dc.subject.keyword | Depth Image,Reversible Network,Wavelet Transform,CNN | en
dc.relation.page | 26 | -
dc.identifier.doi | 10.6342/NTU202300345 | -
dc.rights.note | 未授權 | -
dc.date.accepted | 2023-02-18 | -
dc.contributor.author-college | 電機資訊學院 | -
dc.contributor.author-dept | 資訊工程學系 | -
Appears in Collections: 資訊工程學系

Files in This Item:
File | Size | Format
ntu-111-1.pdf (restricted; not authorized for public access) | 7.05 MB | Adobe PDF