Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/80098
Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 吳沛遠(Pei-Yuan Wu) | |
| dc.contributor.author | Tsung-Shan Yang | en |
| dc.contributor.author | 楊宗山 | zh_TW |
| dc.date.accessioned | 2022-11-23T09:25:59Z | - |
| dc.date.available | 2021-11-05 | |
| dc.date.available | 2022-11-23T09:25:59Z | - |
| dc.date.copyright | 2021-11-05 | |
| dc.date.issued | 2021 | |
| dc.date.submitted | 2021-10-18 | |
| dc.identifier.citation | [1] Faq.md. https://github.com/jonas-koehler/s2cnn/blob/master/FAQ.md. [2] T. Cohen, M. Weiler, B. Kicanaoglu, and M. Welling. Gauge equivariant convolutional networks and the icosahedral cnn. In International Conference on Machine Learning, pages 1321–1330. PMLR, 2019. [3] T. S. Cohen, M. Geiger, J. Köhler, and M. Welling. Spherical cnns. arXiv preprint arXiv:1801.10130, 2018. [4] B. Coors, A. P. Condurache, and A. Geiger. Spherenet: Learning spherical representations for detection and classification in omnidirectional images. In Proceedings of the European Conference on Computer Vision (ECCV), pages 518–533, 2018. [5] J. B. Cordonnier, A. Loukas, and M. Jaggi. On the relationship between self-attention and convolutional layers. arXiv preprint arXiv:1911.03584, 2019. [6] J. Dai, H. Qi, Y. Xiong, Y. Li, G. Zhang, H. Hu, and Y. Wei. Deformable convolutional networks. In Proceedings of the IEEE international conference on computer vision, pages 764–773, 2017. [7] Z. Dai, Z. Yang, Y. Yang, J. Carbonell, Q. V. Le, and R. Salakhutdinov. Transformer-xl: Attentive language models beyond a fixed length context. arXiv preprint arXiv:1901.02860, 2019. [8] J. Devlin, M. W. Chang, K. Lee, and K. Toutanova. Bert: Pretraining of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018. [9] C. Esteves, C. Allen-Blanchette, A. Makadia, and K. Daniilidis. Learning so(3) equivariant representations with spherical cnns. In ECCV, 2018. [10] C. Fernandez-Labrador, J. M. Facil, A. Perez-Yus, C. Demonceaux, J. Civera, and J. J. Guerrero. Corners for layout: End-to-end layout recovery from 360 images. IEEE Robotics and Automation Letters, 5(2):1255–1262, 2020. [11] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016. [12] M. Kazhdan, T. Funkhouser, and S. Rusinkiewicz. Rotation invariant spherical harmonic representation of 3d shape descriptors. In Symposium on geometry processing, volume 6, pages 156–164, 2003. [13] R. Khasanova and P. Frossard. Graph-based classification of omnidirectional images. In Proceedings of the IEEE International Conference on Computer Vision Workshops, pages 869–878, 2017. [14] B. Kicanaoglu, P. de Haan, and T. Cohen. Gauge equivariant spherical cnns. 2019. [15] J. Klicpera, J. Groß, and S. Günnemann. Directional message passing for molecular graphs. arXiv preprint arXiv:2003.03123, 2020. [16] A. Krizhevsky. Learning multiple layers of features from tiny images. 2009. [17] W. S. Lai, Y. Huang, N. Joshi, C. Buehler, M. H. Yang, and S. B. Kang. Semantic driven generation of hyperlapse from 360 degree video. IEEE transactions on visualization and computer graphics, 24(9):2610–2621, 2017. [18] Y. LeCun. The mnist database of handwritten digits. http://yann.lecun.com/exdb/mnist/, 1998. [19] Y. Lee, J. Jeong, J. Yun, W. Cho, and K. J. Yoon. Spherephd: Applying cnns on a spherical polyhedron representation of 360deg images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9181–9189, 2019. [20] Matterport. https://matterport.com/. [21] D. Maturana and S. Scherer. Voxnet: A 3d convolutional neural network for real time object recognition. In 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 922–928. IEEE, 2015. [22] J. Park, S. Woo, J. Y. Lee, and I. S. Kweon. Bam: Bottleneck attention module. arXiv preprint arXiv:1807.06514, 2018. [23] C. R. Qi, H. Su, K. Mo, and L. J. Guibas. Pointnet: Deep learning on point sets for 3d classification and segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 652–660, 2017. [24] P. Ramachandran, N. Parmar, A. Vaswani, I. Bello, A. Levskaya, and J. Shlens. Stand-alone self-attention in vision models. arXiv preprint arXiv:1906.05909, 2019. [25] S. C. Schonsheck, B. Dong, and R. Lai. Parallel transport convolution: A new tool for convolutional neural networks on manifolds. arXiv preprint arXiv:1805.07857, 2018. [26] Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang, and J. Xiao. 3d shapenets: A deep representation for volumetric shapes. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1912–1920, 2015. [27] W. Yang, Y. Qian, J. K. Kämäräinen, F. Cricri, and L. Fan. Object detection in equirectangular panorama. In 2018 24th International Conference on Pattern Recognition (ICPR), pages 2190–2195. IEEE, 2018. [28] J. Yu, Z. Lin, J. Yang, X. Shen, X. Lu, and T. S. Huang. Free-form image inpainting with gated convolution. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 4471–4480, 2019. [29] C. Zhang, S. Liwicki, W. Smith, and R. Cipolla. Orientation-aware semantic segmentation on icosahedron spheres. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 3533–3541, 2019. [30] H. Zhao, J. Jia, and V. Koltun. Exploring self-attention for image recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10076–10085, 2020. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/80098 | - |
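| dc.description.abstract | "This thesis proposes an encoding mechanism that modifies the weights of convolution kernels designed for planar images, so that convolution extracts features from omnidirectional images more effectively while remaining compatible with existing convolutional neural network modules. Classification accuracy on omnidirectional images demonstrates the compatibility of the encoding mechanism with convolutional neural networks and residual modules; experiments on omni-MNIST, omni-CIFAR10, and omni-CIFAR100 achieve state-of-the-art accuracy." | zh_TW |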
| dc.description.provenance | Made available in DSpace on 2022-11-23T09:25:59Z (GMT). No. of bitstreams: 1 U0001-0807202123243900.pdf: 1338524 bytes, checksum: fe9e44d3060c780a2ee6175a256e1c4b (MD5) Previous issue date: 2021 | en |
| dc.description.tableofcontents | Verification Letter from the Oral Examination Committee; Acknowledgements; Abstract (in Chinese); Abstract; Contents; List of Figures; List of Tables; Chapter 1 Introduction; Chapter 2 Related Work; 2.1 CNNs Based on Different Projections; 2.2 Spherical CNNs; 2.3 Self-defined Representations and Kernels; Chapter 3 Method; 3.1 Convolution and Self-Attention on Feature Extraction; 3.2 Encoding with Positional Information; 3.3 Spherical absolute encoding; 3.4 Spherical relative encoding; 3.5 Encoding and Convolution; Chapter 4 Experiment; 4.1 Dataset; 4.2 Classification on omni-MNIST; 4.2.1 Experiment Setup; 4.2.2 Results; 4.3 Residual Module; 4.3.1 Experiment Setup; 4.3.2 Result; 4.4 Self-Attention Model; 4.4.1 Result; Chapter 5 Conclusion; References | |
| dc.language.iso | en | |
| dc.subject | 卷積 (Convolution) | zh_TW |
| dc.subject | 全景影像 (Omnidirectional image) | zh_TW |
| dc.subject | 編碼機制 (Encoding mechanism) | zh_TW |
| dc.subject | Omnidirectional | en |
| dc.subject | Encoding | en |
| dc.subject | Convolution | en |
| dc.title | 用於全景影像卷積操作之編碼機制 | zh_TW |
| dc.title | Omnidirectional Image Encoding | en |
| dc.date.schoolyear | 109-2 | |
| dc.description.degree | Master's | |
| dc.contributor.oralexamcommittee | 丁建均(Jian-Jiun Ding), 王鈺強(Yu-Chiang Frank Wang) | |
| dc.subject.keyword | 全景影像 (Omnidirectional image), 卷積 (Convolution), 編碼機制 (Encoding mechanism) | zh_TW |
| dc.subject.keyword | Omnidirectional, Encoding, Convolution | en |
| dc.relation.page | 30 | |
| dc.identifier.doi | 10.6342/NTU202101356 | |
| dc.rights.note | Authorization granted (open access worldwide) | |
| dc.date.accepted | 2021-10-19 | |
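| dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW |
| dc.contributor.author-dept | Graduate Institute of Communication Engineering | zh_TW |
| Appears in Collections: | Graduate Institute of Communication Engineering | |
Files in this item:
| File | Size | Format | |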
|---|---|---|---|
| U0001-0807202123243900.pdf | 1.31 MB | Adobe PDF | View/Open |
Items in this repository are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.
