
DSpace

The DSpace institutional repository system is dedicated to preserving digital content of all kinds (e.g., text, images, PDF) and making it easy to access.

NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/66922
Full metadata record
(each row: DC field [language]: value)
dc.contributor.advisor: 徐宏民 (Winston H. Hsu)
dc.contributor.author [en]: Chun-Wei Chen
dc.contributor.author [zh_TW]: 陳俊瑋
dc.date.accessioned: 2021-06-17T01:14:55Z
dc.date.available: 2022-08-24
dc.date.copyright: 2017-08-24
dc.date.issued: 2017
dc.date.submitted: 2017-08-15
dc.identifier.citation:
[1] 'Google search.' https://www.google.com.
[2] 'Google street-view.' https://www.google.com/streetview/.
[3] 'Google map.' https://developers.google.com/maps/.
[4] T.-Y. Lin, Y. Cui, S. Belongie, and J. Hays, 'Learning deep representations for ground-to-aerial geolocalization,' in 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5007–5015, IEEE, 2015.
[5] M. Wolff, R. T. Collins, and Y. Liu, 'Regularity-driven facade matching between aerial and street views,' in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016.
[6] A.-J. Cheng, F.-E. Lin, Y.-H. Kuo, and W. H. Hsu, 'GPS, compass, or camera?: Investigating effective mobile sensors for automatic search-based image annotation,' in Proceedings of the 18th ACM International Conference on Multimedia, pp. 815–818, ACM, 2010.
[7] 'Google Places API.' https://developers.google.com/places/.
[8] V. Roberge, M. Tarbouchi, and G. Labonté, 'Comparison of parallel genetic algorithm and particle swarm optimization for real-time UAV path planning,' IEEE Transactions on Industrial Informatics, vol. 9, no. 1, pp. 132–141, 2013.
[9] Y. Bazi, S. Malek, N. Alajlan, and H. AlHichri, 'An automatic approach for palm tree counting in UAV images,' in 2014 IEEE Geoscience and Remote Sensing Symposium, pp. 537–540, IEEE, 2014.
[10] M. Mueller, N. Smith, and B. Ghanem, 'A benchmark and simulator for UAV tracking,' in European Conference on Computer Vision, pp. 445–461, Springer, 2016.
[11] H. Altwaijry, E. Trulls, J. Hays, P. Fua, and S. Belongie, 'Learning to match aerial images with deep attentive architectures,' in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, no. EPFL-CONF-217963, 2016.
[12] D. G. Lowe, 'Distinctive image features from scale-invariant keypoints,' International Journal of Computer Vision, vol. 60, no. 2, pp. 91–110, 2004.
[13] N. Dalal and B. Triggs, 'Histograms of oriented gradients for human detection,' in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), vol. 1, pp. 886–893, IEEE, 2005.
[14] A. Krizhevsky, I. Sutskever, and G. E. Hinton, 'ImageNet classification with deep convolutional neural networks,' in Advances in Neural Information Processing Systems, pp. 1097–1105, 2012.
[15] B. Zhou, A. Lapedriza, J. Xiao, A. Torralba, and A. Oliva, 'Supplementary materials: Learning deep features for scene recognition using Places database.'
[16] J. Wang, Y. Song, T. Leung, C. Rosenberg, J. Wang, J. Philbin, B. Chen, and Y. Wu, 'Learning fine-grained image similarity with deep ranking,' in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1386–1393, 2014.
[17] F. Schroff, D. Kalenichenko, and J. Philbin, 'FaceNet: A unified embedding for face recognition and clustering,' in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 815–823, 2015.
[18] H. Su, S. Maji, E. Kalogerakis, and E. Learned-Miller, 'Multi-view convolutional neural networks for 3D shape recognition,' in Proceedings of the IEEE International Conference on Computer Vision, pp. 945–953, 2015.
[19] C. L. Zitnick and P. Dollár, 'Edge boxes: Locating object proposals from edges,' in European Conference on Computer Vision, pp. 391–405, Springer, 2014.
[20] J. R. Uijlings, K. E. van de Sande, T. Gevers, and A. W. Smeulders, 'Selective search for object recognition,' International Journal of Computer Vision, vol. 104, no. 2, pp. 154–171, 2013.
[21] J. Pont-Tuset, P. Arbeláez, J. Barron, F. Marques, and J. Malik, 'Multiscale combinatorial grouping for image segmentation and object proposal generation,' 2015.
[22] S. Ren, K. He, R. Girshick, and J. Sun, 'Faster R-CNN: Towards real-time object detection with region proposal networks,' in Advances in Neural Information Processing Systems, pp. 91–99, 2015.
[23] J. D. Wegner, S. Branson, D. Hall, K. Schindler, and P. Perona, 'Cataloging public objects using aerial and street-level images - urban trees,' in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6014–6023, 2016.
[24] F. Riaz, A. Hassan, S. Rehman, and U. Qamar, 'Texture classification using rotation- and scale-invariant Gabor texture features,' IEEE Signal Processing Letters, vol. 20, no. 6, pp. 607–610, 2013.
[25] A. M. Rahimi, R. Ruschel, and B. Manjunath, 'UAV sensor fusion with latent-dynamic conditional random fields in coronal plane estimation,' in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4527–4534, 2016.
[26] K. Simonyan and A. Zisserman, 'Very deep convolutional networks for large-scale image recognition,' arXiv preprint arXiv:1409.1556, 2014.
[27] K. He, X. Zhang, S. Ren, and J. Sun, 'Deep residual learning for image recognition,' in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778, 2016.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/66922
dc.description.abstract [zh_TW]: In recent years, quadcopter drones have grown increasingly popular; unlike traditional cameras, they carry multiple additional sensors (capturing images and the drone's geo-location simultaneously). To enable drone applications, providing information about the environment (e.g., surrounding buildings) is essential. We therefore define the problem of drone-view building identification: given a landmark with several of its images and its geo-location, together with the drone's geo-location, find the most likely building proposal in the drone-view image. Although few annotated drone images exist today, many images from other viewpoints are available, such as ground-level photos, street-view images, and aerial images. We therefore propose a cross-view triplet neural network to learn the visual similarity between the drone view and other views. We further incorporate spatial estimation, the drone-angle and drone-distance of each building proposal, using the spatial relation between the drone's and the landmark's geo-locations to improve this challenging cross-view visual search. Because annotated drone-view data is lacking, we also collected a new drone-view dataset (Drone-BR). We then evaluate different neural networks and investigate how to obtain the best performance under various conditions. Finally, our method outperforms state-of-the-art convolutional networks by 0.29 mAP, enabling drones to understand their surroundings more deeply.
dc.description.abstract [en]: Recently, drones have become more popular and are equipped with several types of sensors (for images and geo-location). To enable drone-based applications, it is essential to provide related information (e.g., building information) for understanding the environment around the drone. We frame drone-view building identification as a building retrieval problem: given a building (a multimodal query) with its images and geo-location, together with the drone's current location, retrieve the most likely proposal (building candidate) in a drone-view image. Although there are few annotated drone-view images to date, there are fortunately many images from other viewpoints, such as ground-level, street-view, and aerial images. Hence, we propose a cross-view triplet neural network to learn visual similarity between the drone view and other views. In addition, we consider spatial estimation (drone-angle and drone-distance) for each building proposal, utilizing the drone's geo-location on the geographic map, to solve this challenging cross-view image retrieval problem. Moreover, owing to the lack of annotated drone-view datasets, we collect a new drone-view dataset (Drone-BR) on our own. We evaluate different neural networks and investigate how to achieve the best performance under various conditions. Finally, our method outperforms state-of-the-art approaches (CNN features) by 0.29 mAP, which indeed helps drones understand their surroundings more deeply.
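The cross-view triplet neural network described in the abstract is, at its core, standard triplet metric learning across viewpoints. As a minimal sketch only (PyTorch-style, with hypothetical names; the thesis's actual architecture and loss details are in Chapter 5 of the PDF), the objective might look like:

    import torch
    import torch.nn.functional as F

    def cross_view_triplet_loss(anchor, positive, negative, margin=0.2):
        # anchor:   drone-view embeddings, shape (B, D)
        # positive: other-view embeddings of the same building, shape (B, D)
        # negative: other-view embeddings of a different building, shape (B, D)
        # L2-normalize so distances are comparable across viewpoints.
        a = F.normalize(anchor, dim=1)
        p = F.normalize(positive, dim=1)
        n = F.normalize(negative, dim=1)
        d_pos = (a - p).pow(2).sum(dim=1)  # squared distance to the match
        d_neg = (a - n).pow(2).sum(dim=1)  # squared distance to the non-match
        # Hinge: matched pairs must end up closer than mismatched pairs by `margin`.
        return F.relu(d_pos - d_neg + margin).mean()

Minimizing such a loss pulls drone-view and matching other-view embeddings into a shared space, which is what makes cross-view retrieval by nearest neighbor possible.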
dc.description.provenance [en]: Made available in DSpace on 2021-06-17T01:14:55Z (GMT). No. of bitstreams: 1. ntu-106-R04922050-1.pdf: 14401859 bytes, checksum: ab4eacb3c734b23bf283500b308941d6 (MD5). Previous issue date: 2017.
dc.description.tableofcontents:
摘要 (Abstract in Chinese) ii
Abstract iii
1 Introduction 1
1.1 Importance of Drones 1
1.2 Difficulties with Drones 2
1.3 Dataset Collection and Proposed Method 3
1.4 Contributions 4
2 Related Work 5
2.1 Drone-Based Research 5
2.2 Visual Features 5
2.3 Spatial Information 6
3 Dataset 7
4 Building Identification System 9
4.1 Problem Definition 9
4.2 Drone-View Building Detection 10
4.3 Similarity Scores for Retrieval 11
5 Cross-View Visual Learning 12
5.1 Transfer Learning: Pre-Trained on IG-City8 12
5.2 Drone-Based Triplet Neural Network (DT) 13
5.3 Cross-View Drone-Based Triplet (CVDT) 14
6 Spatial Estimation for Proposals 16
6.1 Drone-Angle (Da) 16
6.2 Drone-Distance (Dx and Dy) 17
7 Experiments 20
7.1 Transfer Learning: The Effect of Initial Models 21
7.2 Cross-View Visual Learning 22
7.3 Retrieval with Spatial Estimation 23
8 Conclusions and Future Work 27
Bibliography 28
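Sections 6.1 and 6.2 of the table of contents estimate a drone-angle and drone-distance from geo-locations. As a generic illustration only (an equirectangular approximation with hypothetical names, not necessarily the thesis's formulation), the bearing and ground distance from the drone to a building could be computed as:

    import math

    def drone_angle_and_distance(drone_lat, drone_lon, bldg_lat, bldg_lon):
        # Bearing (degrees clockwise from north) and ground distance (meters)
        # from the drone's geo-location to a building's geo-location.
        # Equirectangular approximation: adequate at city scale.
        R = 6371000.0  # mean Earth radius in meters
        phi1, phi2 = math.radians(drone_lat), math.radians(bldg_lat)
        x = math.radians(bldg_lon - drone_lon) * math.cos((phi1 + phi2) / 2)  # east offset
        y = phi2 - phi1                                                       # north offset
        distance = R * math.hypot(x, y)
        bearing = (math.degrees(math.atan2(x, y)) + 360.0) % 360.0
        return bearing, distance

For two nearby points, e.g. drone_angle_and_distance(25.0170, 121.5397, 25.0174, 121.5405), this returns a bearing of roughly 61° and a distance of roughly 92 m.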
dc.language.iso: en
dc.subject [zh_TW]: 四軸飛行器 (drone)
dc.subject [zh_TW]: 四軸飛行器視角 (drone view)
dc.subject [zh_TW]: 建築物辨識 (building identification)
dc.subject [en]: Drone
dc.subject [en]: Drone-View Image
dc.subject [en]: Building Identification
dc.title [zh_TW]: 利用跨視角學習與空間估計辨識四軸飛行器視角的建築物 (Drone-view building identification using cross-view learning and spatial estimation)
dc.title [en]: Drone-View Building Identification by Cross-View Visual Learning and Spatial Estimation
dc.type: Thesis
dc.date.schoolyear: 105-2
dc.description.degree: 碩士 (Master)
dc.contributor.oralexamcommittee: 陳文進 (Wen-Chin Chen), 葉梅珍 (Mei-Chen Yeh), 余能豪 (Neng-Hao Yu)
dc.subject.keyword [zh_TW]: 四軸飛行器, 四軸飛行器視角, 建築物辨識 (drone, drone view, building identification)
dc.subject.keyword [en]: Drone, Drone-View Image, Building Identification
dc.relation.page: 31
dc.identifier.doi: 10.6342/NTU201703157
dc.rights.note: 有償授權 (paid-license authorization)
dc.date.accepted: 2017-08-15
dc.contributor.author-college [zh_TW]: 電機資訊學院 (College of Electrical Engineering and Computer Science)
dc.contributor.author-dept [zh_TW]: 資訊工程學研究所 (Graduate Institute of Computer Science and Information Engineering)
Appears in Collections: 資訊工程學系 (Department of Computer Science and Information Engineering)

Files in This Item:
File: ntu-106-1.pdf (restricted; not authorized for public access)
Size: 14.06 MB
Format: Adobe PDF


All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
