NTU Theses and Dissertations Repository › 電機資訊學院 › 電信工程學研究所
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/80098

Full metadata record

DC Field | Value | Language
dc.contributor.advisor: 吳沛遠 (Pei-Yuan Wu)
dc.contributor.author: Tsung-Shan Yang (en)
dc.contributor.author: 楊宗山 (zh_TW)
dc.date.accessioned: 2022-11-23T09:25:59Z
dc.date.available: 2021-11-05
dc.date.available: 2022-11-23T09:25:59Z
dc.date.copyright: 2021-11-05
dc.date.issued: 2021
dc.date.submitted: 2021-10-18
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/80098
dc.description.abstract: This thesis proposes an encoding mechanism that modifies the weights of convolution kernels originally designed for planar images, so that convolution extracts features from omnidirectional (panoramic) images more effectively while remaining compatible with existing convolutional neural network modules. Experimental results on omnidirectional image classification demonstrate the mechanism's compatibility with CNNs and residual modules; experiments on omni-MNIST, omni-CIFAR10, and omni-CIFAR100 achieve state-of-the-art accuracy. (zh_TW)
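The abstract describes re-weighting planar convolution kernel weights with a position-dependent encoding so the same kernel adapts to the distortion of a panoramic image. As a rough illustration only (the thesis's actual encoding formula is not given in this record), the sketch below scales a kernel by the cosine of each output row's latitude on an equirectangular grid; `spherical_encoding` and `encoded_conv2d` are hypothetical names, not taken from the thesis.

```python
import numpy as np

def spherical_encoding(height, row, k=3):
    """Hypothetical per-row encoding: one scale factor per kernel tap,
    derived from the latitude of the output row on an equirectangular
    panorama. Illustrative only; not the thesis's actual formula."""
    # latitude in (-pi/2, pi/2) for this row of the panorama
    lat = np.pi * (row + 0.5) / height - np.pi / 2
    # a simple cos-latitude factor applied uniformly to all k*k taps
    return np.full((k, k), np.cos(lat))

def encoded_conv2d(image, kernel):
    """2-D 'valid' convolution in which the kernel is re-weighted by
    the spherical encoding of each output row before being applied."""
    h, w = image.shape
    k = kernel.shape[0]
    out = np.zeros((h - k + 1, w - k + 1))
    for i in range(out.shape[0]):
        enc = spherical_encoding(h, i, k)   # position-dependent weights
        for j in range(out.shape[1]):
            patch = image[i:i + k, j:j + k]
            out[i, j] = np.sum(patch * kernel * enc)
    return out

img = np.ones((8, 8))
kern = np.ones((3, 3)) / 9.0
feat = encoded_conv2d(img, kern)
print(feat.shape)  # (6, 6)
```

Because the encoding depends only on the output position, a module like this can drop into an existing CNN in place of a standard convolution layer, which is the compatibility property the abstract emphasizes.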
dc.description.provenance: Made available in DSpace on 2022-11-23T09:25:59Z (GMT). No. of bitstreams: 1. U0001-0807202123243900.pdf: 1338524 bytes, checksum: fe9e44d3060c780a2ee6175a256e1c4b (MD5). Previous issue date: 2021. (en)
dc.description.tableofcontents:
Verification Letter from the Oral Examination Committee
Acknowledgements
摘要
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 Related Work
  2.1 CNNs Based on Different Projections
  2.2 Spherical CNNs
  2.3 Self-defined Representations and Kernels
Chapter 3 Method
  3.1 Convolution and Self-Attention on Feature Extraction
  3.2 Encoding with Positional Information
  3.3 Spherical Absolute Encoding
  3.4 Spherical Relative Encoding
  3.5 Encoding and Convolution
Chapter 4 Experiment
  4.1 Dataset
  4.2 Classification on omniMNIST
    4.2.1 Experiment Setup
    4.2.2 Results
  4.3 Residual Module
    4.3.1 Experiment Setup
    4.3.2 Results
  4.4 Self-Attention Model
    4.4.1 Results
Chapter 5 Conclusion
References
dc.language.iso: en
dc.subject: 卷積 (convolution) (zh_TW)
dc.subject: 全景影像 (omnidirectional image) (zh_TW)
dc.subject: 編碼機制 (encoding mechanism) (zh_TW)
dc.subject: Omnidirectional (en)
dc.subject: Encoding (en)
dc.subject: Convolution (en)
dc.title: 用於全景影像卷積操作之編碼機制 (An encoding mechanism for convolution on omnidirectional images) (zh_TW)
dc.title: Omnidirectional Image Encoding (en)
dc.date.schoolyear: 109-2
dc.description.degree: 碩士 (Master's)
dc.contributor.oralexamcommittee: 丁建均(Hsin-Tsai Liu), 王鈺強(Chih-Yang Tseng)
dc.subject.keyword: 全景影像, 卷積, 編碼機制 (zh_TW)
dc.subject.keyword: Omnidirectional, Encoding, Convolution (en)
dc.relation.page: 30
dc.identifier.doi: 10.6342/NTU202101356
dc.rights.note: Authorization granted (open access worldwide)
dc.date.accepted: 2021-10-19
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science) (zh_TW)
dc.contributor.author-dept: 電信工程學研究所 (Graduate Institute of Communication Engineering) (zh_TW)
Appears in collections: 電信工程學研究所 (Graduate Institute of Communication Engineering)

Files in this item:
File | Size | Format
U0001-0807202123243900.pdf | 1.31 MB | Adobe PDF


All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
