Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/66990
Full metadata record
DC Field / Value / Language
dc.contributor.advisor: 徐宏民
dc.contributor.author: Tang Lee [en]
dc.contributor.author: 李唐 [zh_TW]
dc.date.accessioned: 2021-06-17T01:16:32Z
dc.date.available: 2022-08-24
dc.date.copyright: 2017-08-24
dc.date.issued: 2017
dc.date.submitted: 2017-08-14
dc.identifier.citation:
[1] M. Allen, L. Girod, R. Newton, S. Madden, D. T. Blumstein, and D. Estrin. Voxnet: An interactive, rapidly-deployable acoustic monitoring platform. In IPSN, 2008.
[2] B. Amos, B. Ludwiczuk, and M. Satyanarayanan. Openface: A general-purpose face recognition library with mobile applications. Technical report, CMU-CS-16-118, CMU School of Computer Science, 2016.
[3] K. Chatfield, K. Simonyan, A. Vedaldi, and A. Zisserman. Return of the devil in the details: Delving deep into convolutional nets. In BMVC, 2014.
[4] C. B. Choy, D. Xu, J. Gwak, K. Chen, and S. Savarese. 3d-r2n2: A unified approach for single and multi-view 3d object reconstruction. In ECCV, 2016.
[5] M. Abadi et al. Tensorflow: Large-scale machine learning on heterogeneous distributed systems, 2015.
[6] R. Girdhar, D. F. Fouhey, M. Rodriguez, and A. Gupta. Learning a predictable and generative vector representation for objects. In ECCV, 2016.
[7] E. Hoffer and N. Ailon. Deep metric learning using triplet network. In International Workshop on Similarity-Based Pattern Recognition, 2015.
[8] A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In NIPS, 2012.
[9] B. Li, Y. Lu, C. Li, A. Godil, T. Schreck, M. Aono, M. Burtscher, Q. Chen, N. K. Chowdhury, B. Fang, et al. A comparison of 3d shape retrieval methods based on a large-scale benchmark supporting multimodal queries. Computer Vision and Image Understanding, 131:1–27, 2015.
[10] B. Li, Y. Lu, C. Li, A. Godil, T. Schreck, M. Aono, M. Burtscher, H. Fu, T. Furuya, H. Johan, et al. Shrec’14 track: extended large scale sketch-based 3d shape retrieval. In Eurographics workshop on 3D object retrieval, volume 2014, 2014.
[11] Y. Li, H. Su, C. R. Qi, N. Fish, D. Cohen-Or, and L. J. Guibas. Joint embeddings of shapes and images via cnn image purification. ACM Transactions on Graphics, 2015.
[12] J. J. Lim, H. Pirsiavash, and A. Torralba. Parsing ikea objects: Fine pose estimation. In ICCV, 2013.
[13] F. Massa, B. Russell, and M. Aubry. Deep exemplar 2d-3d detection by adapting from real to rendered views. In CVPR, 2016.
[14] B. T. Phong. Illumination for computer generated pictures. Communications of the ACM, 1975.
[15] C. R. Qi, H. Su, M. Niessner, A. Dai, M. Yan, and L. J. Guibas. Volumetric and multi-view cnns for object classification on 3d data. arXiv preprint arXiv:1604.03265, 2016.
[16] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei. Imagenet large scale visual recognition challenge. IJCV, 2015.
[17] A. M. Saxe, J. L. McClelland, and S. Ganguli. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks. arXiv preprint arXiv:1312.6120, 2013.
[18] F. Schroff, D. Kalenichenko, and J. Philbin. Facenet: A unified embedding for face recognition and clustering. In CVPR, 2015.
[19] B. Shi, S. Bai, Z. Zhou, and X. Bai. Deeppano: Deep panoramic representation for 3d shape recognition. IEEE Signal Processing Letters, 2015.
[20] H. Su, S. Maji, E. Kalogerakis, and E. Learned-Miller. Multi-view convolutional neural networks for 3d shape recognition. In ICCV, 2015.
[21] F. P. Tasse and N. Dodgson. Shape2vec: semantic-based descriptors for 3d shapes, sketches and images. TOG, 2016.
[22] F. Wang, L. Kang, and Y. Li. Sketch-based 3d shape retrieval using convolutional neural networks. In CVPR, 2015.
[23] J. Wang, Y. Song, T. Leung, C. Rosenberg, J. Wang, J. Philbin, B. Chen, and Y. Wu. Learning fine-grained image similarity with deep ranking. In CVPR, 2014.
[24] Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang, and J. Xiao. 3d shapenets: A deep representation for volumetric shapes. In CVPR, 2015.
[25] Y. Xiang, R. Mottaghi, and S. Savarese. Beyond pascal: A benchmark for 3d object detection in the wild. In WACV, 2014.
[26] Q. Yu, F. Liu, Y.-Z. Song, T. Xiang, T. M. Hospedales, and C. C. Loy. Sketch me that shoe. In CVPR, 2016.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/66990
dc.description.abstract [zh_TW]: 我們提出一個用於跨領域基於自然圖片之三維模型搜尋的方法,可端對端學習圖片及三維模型共同的特徵空間。我們可根據圖片和三維模型之相似度搜尋,相似度則可由二者在特徵空間中的距離求得。首先,我們提出一個三維模型的特徵抽取方法,稱為跨視圖卷積 (cross-view convolution, CVC)。跨視圖卷積將三維模型之不同角度的二維視圖特徵根據其順序結合,以得出三維模型的整體特徵。為拉近二維自然圖片特徵和三維模型特徵之間領域的差異,我們提出了跨領域三元神經網路 (cross-domain triplet neural network, CDTNN)。該模型在類神經網路中加入一個轉換層,使得圖片特徵經過轉換後能直接與三維模型特徵比較。該模型可以端對端地訓練。最後,我們提出加速版本的跨領域三元神經網路訓練的方法,大幅減少訓練時間。為實驗模型有效性,我們建立了一個龐大的資料集,其中包含自然圖片和三維模型。實驗結果顯示,我們的方法勝過其他當前最好的方法。同時我們也實驗了各種不同的網路結構設計,以減少記憶體及計算資源的使用。
dc.description.abstract [en]: We propose a cross-domain image-based 3D shape retrieval method, which learns a joint embedding space for natural images and 3D shapes in an end-to-end manner. The similarities between images and 3D shapes can be computed as distances in this embedding space. To better encode a 3D shape, we propose a new feature aggregation method, Cross-View Convolution (CVC), which models a 3D shape as a sequence of rendered views. To bridge the gap between images and 3D shapes, we propose a Cross-Domain Triplet Neural Network (CDTNN) that incorporates an adaptation layer to better match features from different domains and can be trained end-to-end. In addition, we speed up the triplet training process by presenting a new fast cross-domain triplet neural network architecture. We evaluate our method on a new image-to-3D-shape dataset. Experimental results demonstrate that our method outperforms state-of-the-art approaches in terms of retrieval performance. We also provide an in-depth analysis of various design choices to further reduce memory storage and computational cost. (A minimal, illustrative code sketch of CVC and the cross-domain triplet setup follows the metadata record below.)
dc.description.provenance [en]: Made available in DSpace on 2021-06-17T01:16:32Z (GMT). No. of bitstreams: 1. ntu-106-R04921036-1.pdf: 5814815 bytes, checksum: 38b0c9c255a62421ad98ac5a3215290a (MD5). Previous issue date: 2017.
dc.description.tableofcontents:
Acknowledgements iii
Abstract (in Chinese) iv
Abstract v
1 Introduction 1
2 Related Work 5
3 Method 7
3.1 3D Shape Representation 7
3.2 Cross-View Convolution (CVC) 7
3.3 Jointly Embedding Images and 3D Shapes 9
3.3.1 Adaptation Layer 9
3.3.2 Fast Cross-Domain Triplet Neural Network 10
3.4 Implementation 12
3.5 Data Preprocessing 12
4 Experiments 13
4.1 Dataset 13
4.2 3D Shape Classification 13
4.3 Image-based 3D Shape Retrieval 14
4.4 Ablation Study 17
5 Conclusions 19
References 20
dc.language.iso: en
dc.subject: 跨域度量學習 [zh_TW]
dc.subject: 卷積神經網路 [zh_TW]
dc.subject: 三維模型 [zh_TW]
dc.subject: 三元神經網路 [zh_TW]
dc.subject: 3D Shape [en]
dc.subject: Convolutional Neural Network [en]
dc.subject: Triplet Neural Network [en]
dc.subject: Cross-Domain Metric Learning [en]
dc.title: 以多視圖序列學習作基於圖像之三維模型跨域搜索 [zh_TW]
dc.title: Cross-Domain Image-Based 3D Shape Retrieval by View Sequence Learning [en]
dc.type: Thesis
dc.date.schoolyear: 105-2
dc.description.degree: 碩士 (Master's)
dc.contributor.coadvisor: 黃寶儀
dc.contributor.oralexamcommittee: 陳文進, 葉梅珍, 黃鐘揚
dc.subject.keyword: 三維模型, 卷積神經網路, 三元神經網路, 跨域度量學習 [zh_TW]
dc.subject.keyword: 3D Shape, Convolutional Neural Network, Triplet Neural Network, Cross-Domain Metric Learning [en]
dc.relation.page: 22
dc.identifier.doi: 10.6342/NTU201703061
dc.rights.note: 有償授權 (paid authorization)
dc.date.accepted: 2017-08-14
dc.contributor.author-college: 電機資訊學院 [zh_TW]
dc.contributor.author-dept: 電機工程學研究所 [zh_TW]
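To make the abstract's two components more concrete, below is a minimal PyTorch sketch of a cross-view convolution over a sequence of rendered-view features and of an adaptation layer trained with a cross-domain triplet loss. The class names (CrossViewConvolution, AdaptationLayer), layer sizes, number of views, and margin value are illustrative assumptions, not the exact architecture reported in the thesis; the accelerated triplet-training variant described in the abstract is not shown.

```python
# Minimal sketch of the ideas described in the abstract.
# All names, dimensions, and the margin are illustrative assumptions,
# not the thesis's exact architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossViewConvolution(nn.Module):
    """Aggregates per-view CNN features of a rendered-view sequence
    into a single 3D-shape descriptor via 1D convolution over the view axis."""

    def __init__(self, feat_dim=4096, embed_dim=512):
        super().__init__()
        # Convolve across views; feature channels act as input channels.
        self.conv = nn.Conv1d(feat_dim, embed_dim, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveMaxPool1d(1)  # collapse the view axis

    def forward(self, view_feats):              # (B, V, feat_dim)
        x = view_feats.transpose(1, 2)          # (B, feat_dim, V)
        x = F.relu(self.conv(x))                # (B, embed_dim, V)
        x = self.pool(x).squeeze(-1)            # (B, embed_dim)
        return F.normalize(x, dim=-1)


class AdaptationLayer(nn.Module):
    """Maps natural-image CNN features into the shared embedding space
    so they can be compared directly with 3D-shape embeddings."""

    def __init__(self, feat_dim=4096, embed_dim=512):
        super().__init__()
        self.fc = nn.Linear(feat_dim, embed_dim)

    def forward(self, img_feats):               # (B, feat_dim)
        return F.normalize(self.fc(img_feats), dim=-1)


def cross_domain_triplet_loss(img_emb, pos_shape_emb, neg_shape_emb, margin=0.2):
    """Pull an image toward its matching shape and push it away from a
    non-matching shape in the joint embedding space."""
    d_pos = (img_emb - pos_shape_emb).pow(2).sum(dim=1)
    d_neg = (img_emb - neg_shape_emb).pow(2).sum(dim=1)
    return F.relu(d_pos - d_neg + margin).mean()


# Toy usage: per-view and image features would normally come from a 2D CNN backbone.
if __name__ == "__main__":
    cvc = CrossViewConvolution()
    adapt = AdaptationLayer()
    shape_views = torch.randn(8, 12, 4096)      # 8 shapes x 12 rendered views
    images = torch.randn(8, 4096)               # CNN features of 8 query images
    shape_emb = cvc(shape_views)
    img_emb = adapt(images)
    # Shift the shape batch by one to fabricate non-matching negatives.
    loss = cross_domain_triplet_loss(img_emb, shape_emb, shape_emb.roll(1, dims=0))
    print(loss.item())
```

In this sketch the 1D convolution runs across the view axis, so the ordering of the rendered views is preserved during aggregation, which is the property the abstract attributes to CVC; a plain max or average pooling over views would discard that ordering.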
Appears in Collections: 電機工程學系

Files in This Item:
File: ntu-106-1.pdf (not authorized for public access)
Size: 5.68 MB
Format: Adobe PDF


