NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/87285

Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 莊永裕 | zh_TW
dc.contributor.advisor | Yung-Yu Chuang | en
dc.contributor.author | 鄭嘉賡 | zh_TW
dc.contributor.author | JIAGENG ZHENG | en
dc.date.accessioned | 2023-05-18T16:50:14Z | -
dc.date.available | 2023-11-09 | -
dc.date.copyright | 2023-05-11 | -
dc.date.issued | 2023 | -
dc.date.submitted | 2023-02-17 | -
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/87285 | -
dc.description.abstract | 隨著各種深度攝影機與三維虛擬實境顯示裝置的普及,帶有深度資訊的三維影像也越來越廣泛地被使用。但是這種新型資料格式尚未有統一的傳輸與存儲格式,並且在傳統設備上也存在相容性問題。
因此本論文提出了一種全新的方法,將三維圖片的深度資訊嵌入在二維通道內,並可解碼復原。本文方法中的神經網絡骨幹,為卷積神經網路與離散小波轉換之結合。
相較於傳統單獨使用卷積神經網路的編碼器與解碼器網路,本方法有佔用記憶體空間小、運算速度快、編解碼品質佳等優點。除此之外,針對可逆圖像轉換問題端到端訓練中遇到的量化誤差,本文提出了一種全新的可求導的量化函數,使得神經網路在測試時更少地受到量化誤差的影響。 | zh_TW
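The abstract describes a backbone that combines a convolutional neural network with a discrete wavelet transform module. As background, a single-level 2-D Haar transform and its exact inverse can be sketched as below. This is a generic illustration under my own naming (`haar_dwt2`, `haar_idwt2`), not the thesis's implementation:

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2-D Haar wavelet transform.

    Splits an array of shape (H, W), with even H and W, into four
    half-resolution sub-bands: approximation (LL) and horizontal,
    vertical, and diagonal details (LH, HL, HH).
    """
    a = img[0::2, 0::2].astype(float)  # top-left of each 2x2 block
    b = img[0::2, 1::2].astype(float)  # top-right
    c = img[1::2, 0::2].astype(float)  # bottom-left
    d = img[1::2, 1::2].astype(float)  # bottom-right
    ll = (a + b + c + d) / 2.0
    lh = (a - b + c - d) / 2.0
    hl = (a + b - c - d) / 2.0
    hh = (a - b - c + d) / 2.0
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2; reconstruction is exact."""
    h, w = ll.shape
    img = np.empty((2 * h, 2 * w))
    img[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    img[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    img[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    img[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return img
```

Because the transform is orthogonal and exactly invertible, it can be placed outside the learned network as a lossless down/upsampling step, which is the motivation an "external wavelet transform module" design typically has.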
dc.description.abstract | With the popularization of depth cameras and 3D virtual-reality displays, 3D images with depth information are increasingly widely used. However, there is no unified format for transmitting and storing this new type of data, and it raises compatibility issues on traditional devices.
Therefore, this thesis proposes a novel method that embeds the depth information of a 3D image into its RGB channels, from which it can later be restored. The backbone of the neural network is a combination of a convolutional neural network and a discrete wavelet transform module.
Compared with traditional encoder and decoder networks that use convolutional neural networks alone, our method requires less memory, computes faster, and yields better visual quality for both encoding and decoding. Additionally, to address the quantization error that arises in end-to-end training of reversible image conversion, we propose a differentiable quantization function, so that the network is less affected by quantization error at test time. | en
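The English abstract mentions a differentiable quantization function but the record does not give its form. One well-known differentiable surrogate for rounding, shown below, is a hypothetical stand-in for illustration only, not necessarily the function proposed in the thesis; `soft_round` and `period` are my own names:

```python
import numpy as np

def soft_round(x, period=1.0):
    """Smooth surrogate for rounding x to the nearest multiple of `period`.

    soft_round(x) = x - sin(2*pi*x/period) / (2*pi/period).
    Its derivative, 1 - cos(2*pi*x/period), is non-negative everywhere,
    so the map is monotone; exact multiples of `period` are fixed points,
    and other values are pulled toward the nearest quantization level.
    """
    w = 2.0 * np.pi / period
    return x - np.sin(w * x) / w
```

During training, such a surrogate (or a straight-through estimator) lets gradients flow through the quantization step; at test time hard rounding is applied, and the smaller the training/test mismatch, the less the quantization error degrades the decoded output.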
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-05-18T16:50:14Z. No. of bitstreams: 0 | en
dc.description.provenance | Made available in DSpace on 2023-05-18T16:50:14Z (GMT). No. of bitstreams: 0 | en
dc.description.tableofcontents | Oral Examination Committee Certification i
Acknowledgements ii
Chinese Abstract iii
Abstract iv
Contents vi
List of Figures viii
List of Tables ix
Chapter 1 Introduction 1
Chapter 2 Related Work 3
2.1 Reversible Work 3
2.1.1 Watermarking 3
2.1.2 Autoencoder 4
2.1.3 CycleGAN 4
2.1.4 Invertible Neural Networks 5
2.2 Depth Estimation 5
2.3 Quantization 5
Chapter 3 Proposed Method 7
3.1 Network Architecture 7
3.1.1 Encoder 8
3.1.2 Decoder 8
3.1.3 Haar Wavelet Transform 9
3.1.4 Backbone 9
3.1.5 Quantization Layer 10
3.2 Why an External Wavelet Transform Module? 11
3.2.1 Stability (a Necessary Condition for Learnability) 11
3.2.2 Limited kernel size of CNN for infinite domain 11
3.2.3 Example 11
3.3 Loss Function 13
3.3.1 Pixel Loss 13
3.3.2 Zero-Mean Noise Loss 13
3.3.3 Structure Loss 13
3.3.4 Combined Loss 14
Chapter 4 Experimental Result 15
4.1 Dataset 15
4.2 Comparison to the Baseline Models 15
4.3 Performance 20
4.4 Ablation Study 20
Chapter 5 Conclusion 22
Bibliography 23
-
dc.language.iso | en | -
dc.subject | 小波轉換 | zh_TW
dc.subject | 深度影像 | zh_TW
dc.subject | 可逆網路 | zh_TW
dc.subject | 卷積網路 | zh_TW
dc.subject | Depth Image | en
dc.subject | CNN | en
dc.subject | Wavelet Transform | en
dc.subject | Reversible Network | en
dc.title | 使用小波轉換卷積神經網絡進行RGB-D影像深度訊息的可逆嵌入 | zh_TW
dc.title | Reversible Depth Embedding for RGB-D Images via Wavelet-CNN Network | en
dc.type | Thesis | -
dc.date.schoolyear | 111-1 | -
dc.description.degree | 碩士 | -
dc.contributor.oralexamcommittee | 吳賦哲;葉正聖 | zh_TW
dc.contributor.oralexamcommittee | Fu-Che Wu;Jeng-Sheng Yeh | en
dc.subject.keyword | 深度影像,可逆網路,小波轉換,卷積網路 | zh_TW
dc.subject.keyword | Depth Image,Reversible Network,Wavelet Transform,CNN | en
dc.relation.page | 26 | -
dc.identifier.doi | 10.6342/NTU202300345 | -
dc.rights.note | 未授權 | -
dc.date.accepted | 2023-02-18 | -
dc.contributor.author-college | 電機資訊學院 | -
dc.contributor.author-dept | 資訊工程學系 | -
Appears in Collections: 資訊工程學系

Files in This Item:
File | Size | Format
ntu-111-1.pdf (restricted; not authorized for public access) | 7.05 MB | Adobe PDF