Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/87285

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 莊永裕 | zh_TW |
| dc.contributor.advisor | Yung-Yu Chuang | en |
| dc.contributor.author | 鄭嘉賡 | zh_TW |
| dc.contributor.author | JIAGENG ZHENG | en |
| dc.date.accessioned | 2023-05-18T16:50:14Z | - |
| dc.date.available | 2023-11-09 | - |
| dc.date.copyright | 2023-05-11 | - |
| dc.date.issued | 2023 | - |
| dc.date.submitted | 2023-02-17 | - |
| dc.identifier.citation | [1] L. Ardizzone, J. Kruse, S. Wirkert, D. Rahner, E. W. Pellegrini, R. S. Klessen, L. Maier-Hein, C. Rother, and U. Köthe. Analyzing inverse problems with invertible neural networks. arXiv preprint arXiv:1808.04730, 2018.
[2] K. L. Cheng, Y. Xie, and Q. Chen. IICNet: A generic framework for reversible image conversion. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 1991–2000, 2021.
[3] M. Courbariaux and Y. Bengio. BinaryNet: Training deep neural networks with weights and activations constrained to +1 or -1. CoRR, abs/1602.02830, 2016.
[4] I. J. Cox, J. Kilian, F. T. Leighton, and T. Shamoon. Secure spread spectrum watermarking for multimedia. IEEE Transactions on Image Processing, 6(12):1673–1687, 1997.
[5] E. Ganic and A. M. Eskicioglu. Robust DWT-SVD domain image watermarking: embedding data in all frequencies. In Proceedings of the 2004 Workshop on Multimedia and Security, pages 166–174, 2004.
[6] K. He, X. Zhang, S. Ren, and J. Sun. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In Proceedings of the IEEE International Conference on Computer Vision, pages 1026–1034, 2015.
[7] G. E. Hinton and R. R. Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507, 2006.
[8] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros. Image-to-image translation with conditional adversarial networks. In CVPR, 2017.
[9] J. Kim, M. Kim, H. Kang, and K. H. Lee. U-GAT-IT: Unsupervised generative attentional networks with adaptive layer-instance normalization for image-to-image translation. In International Conference on Learning Representations, 2020.
[10] M. Klingner, J.-A. Termöhlen, J. Mikolajczyk, and T. Fingscheidt. Self-supervised monocular depth estimation: Solving the dynamic object problem by semantic guidance. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XX, pages 582–600. Springer, 2020.
[11] H.-Y. Lee, H.-Y. Tseng, J.-B. Huang, M. Singh, and M.-H. Yang. Diverse image-to-image translation via disentangled representations. In Proceedings of the European Conference on Computer Vision (ECCV), pages 35–51, 2018.
[12] A. L. Maas, A. Y. Hannun, A. Y. Ng, et al. Rectifier nonlinearities improve neural network acoustic models. In Proc. ICML, volume 30, page 3. Atlanta, Georgia, USA, 2013.
[13] V. Nair and G. E. Hinton. Rectified linear units improve restricted Boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), pages 807–814, 2010.
[14] N. Silberman, D. Hoiem, P. Kohli, and R. Fergus. Indoor segmentation and support inference from RGBD images. In ECCV, 2012.
[15] N. Nikolaidis and I. Pitas. Robust image watermarking in the spatial domain. Signal Processing, 66(3):385–403, 1998.
[16] H. Qin, R. Gong, X. Liu, X. Bai, J. Song, and N. Sebe. Binary neural networks: A survey. Pattern Recognition, 105:107281, 2020.
[17] R. Ranftl, K. Lasinger, D. Hafner, K. Schindler, and V. Koltun. Towards robust monocular depth estimation: Mixing datasets for zero-shot cross-dataset transfer. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(3):1623–1637, 2020.
[18] O. Ronneberger, P. Fischer, and T. Brox. U-Net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 234–241. Springer, 2015.
[19] R. Shin and D. Song. JPEG-resistant adversarial images. In NIPS 2017 Workshop on Machine Learning and Computer Security, volume 1, page 8, 2017.
[20] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4):600–612, 2004.
[21] M. Xia, W. Hu, X. Liu, and T.-T. Wong. Deep halftoning with reversible binary pattern. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 14000–14009, 2021.
[22] M. Xia, X. Liu, and T.-T. Wong. Invertible grayscale. ACM Transactions on Graphics (TOG), 37(6):1–10, 2018.
[23] K. Xian, C. Shen, Z. Cao, H. Lu, Y. Xiao, R. Li, and Z. Luo. Monocular relative depth perception with web stereo data supervision. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.
[24] J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, pages 2223–2232, 2017. | - |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/87285 | - |
| dc.description.abstract | 隨著各種深度攝影機與三維虛擬實境顯示裝置的普及,帶有深度資訊的三維影像也越來越廣泛地被使用。但是這種新型資料尚未有統一的傳輸與存儲格式,並且在傳統設備上也存在相容性問題。
因此,本論文提出一種全新的方法,將三維圖片的深度資訊嵌入在二維通道內,並可解碼復原。本文方法中的神經網路骨幹為卷積神經網路與離散小波轉換模組之結合。相較於傳統單獨使用卷積神經網路的編碼器與解碼器網路,本方法具有佔用記憶體空間小、運算速度快、編解碼品質佳等優點。除此之外,針對可逆圖像轉換問題在端到端訓練中遇到的量化誤差,本文提出一種全新的可求導量化函數,使神經網路在測試時更少受到量化誤差的影響。 | zh_TW |
| dc.description.abstract | With the popularization of depth cameras and 3D virtual reality displays, 3D images carrying depth information are increasingly widely used. However, there is no unified format for transmitting and storing this new type of data, and it also raises compatibility issues on traditional devices.
Therefore, this thesis proposes a novel method that embeds the depth information of a 3D image into its RGB channels and can later decode it to restore the depth. The backbone of the method is a combination of a convolutional neural network and a discrete wavelet transform module. Compared with traditional encoder-decoder networks that use convolutional neural networks alone, our method requires less memory, runs faster, and produces better visual quality for both encoding and decoding. Additionally, to address the quantization error that arises in end-to-end training of reversible image conversion problems, we propose a differentiable quantization function, so that the network is less affected by quantization error at test time. | en |
| dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-05-18T16:50:14Z No. of bitstreams: 0 | en |
| dc.description.provenance | Made available in DSpace on 2023-05-18T16:50:14Z (GMT). No. of bitstreams: 0 | en |
| dc.description.tableofcontents | 口試委員會審定書 i
誌謝 ii
摘要 iii
Abstract iv
Contents vi
List of Figures viii
List of Tables ix
Chapter 1 Introduction 1
Chapter 2 Related Work 3
2.1 Reversible Work 3
2.1.1 Watermarking 3
2.1.2 Autoencoder 4
2.1.3 CycleGAN 4
2.1.4 Invertible Neural Networks 5
2.2 Depth Estimation 5
2.3 Quantization 5
Chapter 3 Proposed Method 7
3.1 Network Architecture 7
3.1.1 Encoder 8
3.1.2 Decoder 8
3.1.3 Haar Wavelet Transform 9
3.1.4 Backbone 9
3.1.5 Quantization Layer 10
3.2 Why external Wavelet Transform Module? 11
3.2.1 Stability (necessary condition of learnability) 11
3.2.2 Limited kernel size of CNN for infinite domain 11
3.2.3 Example 11
3.3 Loss Function 13
3.3.1 Pixel Loss 13
3.3.2 Zero-Mean Noise Loss 13
3.3.3 Structure Loss 13
3.3.4 Combined Loss 14
Chapter 4 Experimental Result 15
4.1 Dataset 15
4.2 Comparison to the baseline models 15
4.3 Performance 20
4.4 Ablation Study 20
Chapter 5 Conclusion 22
Bibliography 23 | - |
| dc.language.iso | en | - |
| dc.subject | 小波轉換 | zh_TW |
| dc.subject | 深度影像 | zh_TW |
| dc.subject | 可逆網路 | zh_TW |
| dc.subject | 卷積網路 | zh_TW |
| dc.subject | Depth Image | en |
| dc.subject | CNN | en |
| dc.subject | Wavelet Transform | en |
| dc.subject | Reversible Network | en |
| dc.title | 使用小波轉換卷積神經網絡進行RGB-D影像深度訊息的可逆嵌入 | zh_TW |
| dc.title | Reversible Depth Embedding for RGB-D Images via Wavelet-CNN Network | en |
| dc.type | Thesis | - |
| dc.date.schoolyear | 111-1 | - |
| dc.description.degree | 碩士 | - |
| dc.contributor.oralexamcommittee | 吳賦哲;葉正聖 | zh_TW |
| dc.contributor.oralexamcommittee | Fu-Che Wu;Jeng-Sheng Yeh | en |
| dc.subject.keyword | 深度影像、可逆網路、小波轉換、卷積網路 | zh_TW |
| dc.subject.keyword | Depth Image, Reversible Network, Wavelet Transform, CNN | en |
| dc.relation.page | 26 | - |
| dc.identifier.doi | 10.6342/NTU202300345 | - |
| dc.rights.note | 未授權 | - |
| dc.date.accepted | 2023-02-18 | - |
| dc.contributor.author-college | 電機資訊學院 | - |
| dc.contributor.author-dept | 資訊工程學系 | - |
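The abstracts above describe two technical ingredients: a Haar discrete wavelet transform module combined with a CNN, and a differentiable quantization function that reduces quantization error during end-to-end training. The full thesis text is not part of this record, so the following is only a minimal sketch of those two ideas under stated assumptions: PyTorch is assumed as the framework, the names `haar_dwt2d` and `round_ste` are hypothetical, `haar_dwt2d` is a standard single-level 2D Haar decomposition, and `round_ste` is a common straight-through-estimator rounding trick, not the thesis's proposed quantization function.

```python
# Illustrative sketch only (not the thesis's code): a single-level 2D Haar
# wavelet decomposition and a straight-through rounding helper, assuming a
# PyTorch tensor of shape (N, C, H, W) with even H and W.
import torch
import torch.nn.functional as F


def haar_dwt2d(x: torch.Tensor) -> torch.Tensor:
    """Split each channel into LL, LH, HL, HH sub-bands with stride-2 Haar filters."""
    c = x.shape[1]
    # Orthonormal 2x2 Haar analysis filters.
    ll = torch.tensor([[0.5, 0.5], [0.5, 0.5]])
    lh = torch.tensor([[0.5, 0.5], [-0.5, -0.5]])
    hl = torch.tensor([[0.5, -0.5], [0.5, -0.5]])
    hh = torch.tensor([[0.5, -0.5], [-0.5, 0.5]])
    filt = torch.stack([ll, lh, hl, hh]).unsqueeze(1)        # (4, 1, 2, 2)
    filt = filt.repeat(c, 1, 1, 1).to(dtype=x.dtype, device=x.device)
    # Depthwise convolution: each input channel produces its own 4 sub-bands.
    return F.conv2d(x, filt, stride=2, groups=c)             # (N, 4C, H/2, W/2)


def round_ste(x: torch.Tensor) -> torch.Tensor:
    """Round to integers in the forward pass; pass gradients through unchanged."""
    return x + (torch.round(x) - x).detach()


if __name__ == "__main__":
    rgb = torch.rand(1, 3, 256, 256, requires_grad=True)
    sub_bands = haar_dwt2d(rgb)                 # (1, 12, 128, 128)
    quantized = round_ste(sub_bands * 255.0) / 255.0
    quantized.sum().backward()                  # gradients still reach the input
    print(sub_bands.shape, rgb.grad.shape)
```

The thesis's actual quantization function and network layers are not reproduced here; `round_ste` merely illustrates why a differentiable surrogate is needed when the encoded RGB image must ultimately be stored as 8-bit integers.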
| Appears in Collections: | 資訊工程學系 |
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-111-1.pdf (restricted; not authorized for public access) | 7.05 MB | Adobe PDF |
