Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/78640
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 盧奕璋(Yi-Chang Lu) | |
dc.contributor.author | Yen-Po Lin | en |
dc.contributor.author | 林彥伯 | zh_TW |
dc.date.accessioned | 2021-07-11T15:09:08Z | - |
dc.date.available | 2023-10-31 | |
dc.date.copyright | 2020-11-13 | |
dc.date.issued | 2020 | |
dc.date.submitted | 2020-11-09 | |
dc.identifier.citation | [1] M. Atzmon, H. Maron, and Y. Lipman. Point convolutional neural networks by extension operators. arXiv preprint arXiv:1803.10091, 2018. [2] Y. Ben-Shabat, M. Lindenbaum, and A. Fischer. 3DmFV: Three-dimensional point cloud classification in real-time using convolutional neural networks. IEEE Robotics and Automation Letters, 3(4):3145–3152, 2018. [3] X. Chen, L. Xie, J. Wu, and Q. Tian. Progressive differentiable architecture search: Bridging the depth gap between search and evaluation. In Proceedings of the IEEE International Conference on Computer Vision, pages 1294–1303, 2019. [4] Z. Gao, L. Wang, and G. Wu. LIP: Local importance-based pooling. In Proceedings of the IEEE International Conference on Computer Vision, pages 3355–3364, 2019. [5] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016. [6] P. Hermosilla, T. Ritschel, P.-P. Vázquez, À. Vinacua, and T. Ropinski. Monte Carlo convolution for learning on non-uniformly sampled point clouds. ACM Transactions on Graphics, 37(6):1–12, 2018. [7] J. Hu, L. Shen, and G. Sun. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7132–7141, 2018. [8] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4700–4708, 2017. [9] S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167, 2015. [10] M. Jaderberg, K. Simonyan, A. Zisserman, et al. Spatial transformer networks. In Advances in Neural Information Processing Systems, pages 2017–2025, 2015. [11] R. Klokov and V. Lempitsky. Escape from cells: Deep kd-networks for the recognition of 3D point cloud models. In Proceedings of the IEEE International Conference on Computer Vision, pages 863–872, 2017. [12] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105, 2012. [13] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998. [14] G. Li, M. Müller, G. Qian, I. C. Delgadillo, A. Abualshour, A. Thabet, and B. Ghanem. DeepGCNs: Making GCNs go as deep as CNNs. arXiv preprint arXiv:1910.06849, 2019. [15] G. Li, G. Qian, I. C. Delgadillo, M. Müller, A. Thabet, and B. Ghanem. SGAS: Sequential greedy architecture search. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1620–1630, 2020. [16] J. Li, B. M. Chen, and G. Hee Lee. SO-Net: Self-organizing network for point cloud analysis. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 9397–9406, 2018. [17] Y. Li, R. Bu, M. Sun, W. Wu, X. Di, and B. Chen. PointCNN: Convolution on X-transformed points. In Advances in Neural Information Processing Systems, pages 820–830, 2018. [18] Y. Li, S. Pirk, H. Su, C. R. Qi, and L. J. Guibas. FPNN: Field probing neural networks for 3D data. In Advances in Neural Information Processing Systems, pages 307–315, 2016. [19] H. Liu, K. Simonyan, and Y. Yang. DARTS: Differentiable architecture search. arXiv preprint arXiv:1806.09055, 2018. [20] Y. Liu, B. Fan, G. Meng, J. Lu, S. Xiang, and C. Pan. DensePoint: Learning densely contextual representation for efficient point cloud processing. In Proceedings of the IEEE International Conference on Computer Vision, pages 5239–5248, 2019. [21] Y. Liu, B. Fan, S. Xiang, and C. Pan. Relation-shape convolutional neural network for point cloud analysis. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 8895–8904, 2019. [22] I. Loshchilov and F. Hutter. SGDR: Stochastic gradient descent with warm restarts. arXiv preprint arXiv:1608.03983, 2016. [23] D. Maturana and S. Scherer. VoxNet: A 3D convolutional neural network for real-time object recognition. In 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 922–928. IEEE, 2015. [24] A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer. Automatic differentiation in PyTorch. 2017. [25] C. R. Qi, H. Su, K. Mo, and L. J. Guibas. PointNet: Deep learning on point sets for 3D classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 652–660, 2017. [26] C. R. Qi, L. Yi, H. Su, and L. J. Guibas. PointNet++: Deep hierarchical feature learning on point sets in a metric space. In Advances in Neural Information Processing Systems, pages 5099–5108, 2017. [27] E. Real, S. Moore, A. Selle, S. Saxena, Y. L. Suematsu, J. Tan, Q. Le, and A. Kurakin. Large-scale evolution of image classifiers. arXiv preprint arXiv:1703.01041, 2017. [28] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014. [29] R. K. Srivastava, K. Greff, and J. Schmidhuber. Training very deep networks. In Advances in Neural Information Processing Systems, pages 2377–2385, 2015. [30] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1–9, 2015. [31] H. Thomas, C. R. Qi, J.-E. Deschaud, B. Marcotegui, F. Goulette, and L. J. Guibas. KPConv: Flexible and deformable convolution for point clouds. In Proceedings of the IEEE International Conference on Computer Vision, pages 6411–6420, 2019. [32] M. A. Uy, Q.-H. Pham, B.-S. Hua, T. Nguyen, and S.-K. Yeung. Revisiting point cloud classification: A new benchmark dataset and classification model on real-world data. In Proceedings of the IEEE International Conference on Computer Vision, pages 1588–1597, 2019. [33] P.-S. Wang, Y. Liu, Y.-X. Guo, C.-Y. Sun, and X. Tong. O-CNN: Octree-based convolutional neural networks for 3D shape analysis. ACM Transactions on Graphics, 36(4):1–11, 2017. [34] Y. Wang, Y. Sun, Z. Liu, S. E. Sarma, M. M. Bronstein, and J. M. Solomon. Dynamic graph CNN for learning on point clouds. ACM Transactions on Graphics, 38(5):1–12, 2019. [35] S. Woo, J. Park, J.-Y. Lee, and I. So Kweon. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision, pages 3–19, 2018. [36] Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang, and J. Xiao. 3D ShapeNets: A deep representation for volumetric shapes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1912–1920, 2015. [37] Y. Xu, T. Fan, M. Xu, L. Zeng, and Y. Qiao. SpiderCNN: Deep learning on point sets with parameterized convolutional filters. In Proceedings of the European Conference on Computer Vision, pages 87–102, 2018. [38] Y. Xu, L. Xie, X. Zhang, X. Chen, G.-J. Qi, Q. Tian, and H. Xiong. PC-DARTS: Partial channel connections for memory-efficient differentiable architecture search. arXiv preprint arXiv:1907.05737, 2019. [39] F. Yu and V. Koltun. Multi-scale context aggregation by dilated convolutions. arXiv preprint arXiv:1511.07122, 2015. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/78640 | - |
dc.description.abstract | 近幾年來,隨著對自動駕駛技術的投入,三維點雲的研究也隨之 蓬勃發展。其中由於 3D 點雲有著不規則以及無順序的特性,因此要 抓取點與點之間的幾何特徵是非常困難的。本論文提出了 3 種方法 來改善抓取點雲特徵的能力,進而提升點雲分類任務的正確及穩定 度。在第一個方法中我們引入了 2 種不同面向的注意力機制,分別為 用來決定點與點之間關聯性大小的點注意力模組 (Point-wise Attention Module) 以及讓模型在有限資源下更專注於重要特徵的通道注意力模 組 (Channel-wise Attention Module)。採用了此方法後,本論文不只在 ModelNet40 資料集上達到了最先進的正確率 93.7%,在 ScanObjectNN 資料集上的錯誤率相比於 DGCNN 也減少了 2.96% ~ 7.49%。第二個 方法則是動態 K 值調整 (Dynamic K),我們藉由動態調整 K-近鄰演算 法 (KNN) 的大小來改善在面對低解析度物體時的正確率。有了這個方 法後,我們在面對低解析度物體時,正確率有著 2.4% ~ 434.7% 增長。 最後第三種方法我們利用了神經網路搜索 (NAS) 的技術來找出更適合 點雲分類任務的架構。經由實驗結果證明,神經網路搜索 (NAS) 的方 法確實能帶來更好的性能。透過此方法,我們在 ModelNet40 的正確率 進一步提升到了 93.9%,在 ScanObjectNN 的正確率也與人工設計的架 構表現相當。 | zh_TW |
dc.description.abstract | In recent years, investment in autonomous driving technology has led to rapid growth in 3D point cloud research. Because 3D point clouds are irregular and unordered, capturing the geometric relationships between points is difficult. This thesis proposes three methods that improve the capture of point cloud features and thereby the accuracy and stability of point cloud classification. The first method introduces two attention mechanisms: a Point-wise Attention Module, which determines the strength of the correlation between points, and a Channel-wise Attention Module, which lets the model focus on important features under limited resources. With these attention mechanisms, we not only achieve state-of-the-art accuracy of 93.7% on the ModelNet40 [36] dataset, but also reduce the error rate by 2.96% to 7.49% on the ScanObjectNN [32] dataset compared to DGCNN. The second method is Dynamic K: we dynamically adjust the neighborhood size of the k-nearest-neighbors (KNN) search to improve accuracy on low-resolution objects, yielding accuracy gains of 2.4% to 434.7% on such objects. The third method uses neural architecture search (NAS) to find an architecture better suited to the point cloud classification task. Experimental results show that NAS does bring better performance: accuracy on ModelNet40 [36] further improves to 93.9%, while accuracy on ScanObjectNN [32] is comparable to that of the handcrafted architecture. | en |
dc.description.provenance | Made available in DSpace on 2021-07-11T15:09:08Z (GMT). No. of bitstreams: 1 U0001-0511202017081900.pdf: 6828835 bytes, checksum: 8410b6daa71beec750bc02b845b8a715 (MD5) Previous issue date: 2020 | en |
dc.description.tableofcontents | 口試委員會審定書 (Oral Defense Committee Certification); 誌謝 (Acknowledgements); 摘要 (Chinese Abstract); Abstract; 1 Introduction; 1.1 Various Representations of 3D Objects; 1.2 Introduction of 3D Point Cloud Object Classification; 1.3 Contribution; 1.4 Thesis Organization; 2 Related Work; 2.1 Deep Learning on 3D Point Cloud Object; 2.1.1 Transform into Standard Volumetric Grids; 2.1.2 Directly Process on Point Cloud Object; 2.1.3 Consider Local Information; 2.2 Attention Mechanism; 2.2.1 Spatial Domain; 2.2.2 Channel Domain; 2.3 Neural Architecture Search (NAS); 2.3.1 Reinforcement Learning; 2.3.2 Evolutionary Algorithms; 2.3.3 Gradient-based; 3 Problem Statement and Datasets; 3.1 Problem Statement; 3.2 Datasets; 3.2.1 ModelNet40; 3.2.2 ScanObjectNN; 4 Handcrafted Architecture Method; 4.1 Preliminaries: DGCNN; 4.2 Attention EdgeConv; 4.2.1 Point-wise Attention Module (PAM); 4.2.2 Channel-wise Attention Module (CAM); 4.3 Dynamic K Method; 4.4 Experiments; 4.4.1 Handcrafted Architecture; 4.4.2 Training Details; 4.4.3 ModelNet40 Result; 4.4.4 ScanObjectNN Result; 4.4.5 Robustness Test; 4.5 Design Analysis; 4.5.1 Ablation Study; 4.5.2 Reduction Ratio; 4.5.3 Other Point-wise Attention Mechanisms; 4.5.4 Dealing with Imbalanced ModelNet40 Dataset; 4.5.5 Deep Dive into ModelNet40 Dataset; 4.5.6 More Experiments of Dynamic K; 5 Neural Architecture Search Method; 5.1 Preliminaries: DARTS and PC-DARTS; 5.1.1 DARTS: Differentiable Architecture Search; 5.1.2 PC-DARTS: Partial Channel Connections for Memory-Efficient Architecture Search; 5.2 Our Search and Evaluation Settings; 5.2.1 Search Settings; 5.2.2 Evaluation Settings; 5.3 Experiments; 5.3.1 Search on ModelNet40; 5.3.2 ModelNet40 Result; 5.3.3 Search on ScanObjectNN; 5.3.4 ScanObjectNN Result; 5.3.5 Robustness Test; 5.4 Design Analysis; 5.4.1 Selection of Candidate Operations; 5.4.2 Impact of Decision Ratio; 6 Conclusion; 6.1 Conclusion; 6.2 Future Work; Bibliography | |
dc.language.iso | en | |
dc.title | 應用於三維點雲分類任務之注意力機制及神經網路搜索架構 | zh_TW |
dc.title | Attention Mechanism and Neural Architecture Search for Three-dimensional Point Cloud Classification | en |
dc.type | Thesis | |
dc.date.schoolyear | 109-1 | |
dc.description.degree | 碩士 | |
dc.contributor.oralexamcommittee | 連豊力(Feng-Li Lian),丁建均(Jian-Jiun Ding),吳沛遠(Pei-Yuan Wu) | |
dc.subject.keyword | 點雲分類,注意力機制,動態 K 值調整,神經網路搜索, | zh_TW |
dc.subject.keyword | Point Cloud Classification, Attention Mechanism, Dynamic K, Neural Architecture Search | en |
dc.relation.page | 75 | |
dc.identifier.doi | 10.6342/NTU202004324 | |
dc.rights.note | 有償授權 | |
dc.date.accepted | 2020-11-09 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 電子工程學研究所 | zh_TW |
dc.date.embargo-lift | 2023-10-31 | - |
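The Channel-wise Attention Module described in the abstract follows the general squeeze-and-excitation pattern of [7]: pool the features into a per-channel descriptor, pass it through a small bottleneck, and use the result to gate each channel. A minimal NumPy sketch of that general pattern (the function names, mean pooling, and weight shapes are illustrative assumptions, not the thesis's exact module):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(features, w1, w2):
    """Squeeze-and-excitation-style channel gating over per-point features.

    features: (N, C) array of point features.
    w1: (C, C//r) and w2: (C//r, C) bottleneck weights, r = reduction ratio.
    """
    squeeze = features.mean(axis=0)         # (C,) global channel descriptor
    hidden = np.maximum(squeeze @ w1, 0.0)  # ReLU bottleneck
    scale = sigmoid(hidden @ w2)            # (C,) per-channel gate in (0, 1)
    return features * scale                 # reweight every channel
```

Because the gate lies in (0, 1), the module can only attenuate channels relative to the input; the bottleneck (reduction ratio r, analyzed in Section 4.5.2 of the thesis) keeps the added parameter count small.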
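The abstract's Dynamic K method adapts the KNN neighborhood size to the resolution of the input cloud, so that sparse objects are not forced to use a neighborhood tuned for dense ones. A minimal sketch of the idea, assuming a simple linear scaling rule (the `base_k`/`base_points` constants and the clamping bounds are hypothetical, not taken from the thesis):

```python
import numpy as np

def dynamic_k(num_points, base_k=20, base_points=1024, k_min=5):
    """Scale the KNN neighborhood size k with the point-cloud resolution."""
    k = int(round(base_k * num_points / base_points))
    return max(k_min, min(k, num_points - 1))  # clamp to a sensible range

def knn_indices(points, k):
    """Indices of the k nearest neighbors of every point (self excluded)."""
    diff = points[:, None, :] - points[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)          # (N, N) squared distances
    np.fill_diagonal(dist2, np.inf)           # exclude the point itself
    return np.argsort(dist2, axis=1)[:, :k]   # (N, k) neighbor indices
```

Under this rule a 1024-point cloud keeps the common default k = 20, while a 128-point cloud falls back to the `k_min` floor instead of gathering a fifth of the whole object as "local" context.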
Appears in Collections: | 電子工程學研究所
Files in This Item:
File | Size | Format |
---|---|---|---|
U0001-0511202017081900.pdf (not authorized for public access) | 6.67 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.