Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/60572
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 歐陽明(Ming Ouhyoung) | |
dc.contributor.author | Liang-Han Lin | en |
dc.contributor.author | 林良翰 | zh_TW |
dc.date.accessioned | 2021-06-16T10:22:02Z | - |
dc.date.available | 2020-08-03 | |
dc.date.copyright | 2020-08-03 | |
dc.date.issued | 2020 | |
dc.date.submitted | 2020-07-14 | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/60572 | - |
dc.description.abstract | 為了解決圖像投影扭曲問題以及物件對應問題,我們設計並實作了一款基於網路應用的標注工具,並且命名為Label 360。在標注的格式上,我們定義每一個物件都是由一個球面多邊形去包圍,並紀錄其每個頂點在全景影像上的經緯度位置。另外,我們也實作了一個後處理算法,將標注出的多邊形頂點透過大圓航線公式去計算頂點間的連線,並畫出全景影像中每一個像素的類別。最後我們針對標註工具進行了兩項實驗,第一項是找兩位標註專家來使用Label 360對同一組影像標註,然後檢驗兩者標註結果的一致性,結果得到了整體約0.92的IoU;第二項是找一位標註專家分別使用Label 360和LabelMe去標註同一組影像,從得到的結果中,可以得知我們的標註工具在標註速度上面比LabelMe快1.45倍。從以上兩個實驗可以證明,在全景影像的例項語意分割的資料標註工作上,我們的工具可以有效的去幫助標註者進行標註,並且透過解決投影扭曲、物件對應的問題,來減少對標註人員的負擔以及標註難度,更提升了標註人員的工作效率、提升了整體標注的品質。 | zh_TW |
dc.description.abstract | We design and develop a web-based annotation tool, Label 360, which aims to solve the projection-distortion and instance-correspondence problems that arise when annotating panoramas. We define a new annotation format that records polygon vertices as (longitude, latitude) pairs, and introduce a post-processing algorithm that connects the vertices along great-circle arcs to generate a pixel-wise label mask of the panoramic image (an illustrative sketch of this step follows the metadata table below). We also conduct two experiments: one examines the consistency between different annotators using Label 360, and the other compares labelling efficiency with LabelMe. Our tool reaches an overall IoU of about 0.92 in the consistency test and annotates about 1.45x faster than LabelMe, showing that Label 360 is suitable for producing human-annotated semantic segmentation masks for panoramas; by solving the distortion and correspondence problems, it makes the panorama annotation process easier, more efficient, and more accurate. | en |
dc.description.provenance | Made available in DSpace on 2021-06-16T10:22:02Z (GMT). No. of bitstreams: 1 U0001-0407202015122000.pdf: 3490490 bytes, checksum: 3d82d55b790ebb11e8beec06f270fbbc (MD5) Previous issue date: 2020 | en |
dc.description.tableofcontents | 口試委員會審定書 . . . iii 誌謝 . . . v Acknowledgements . . . vii 摘要 . . . ix Abstract . . . xi 1 Introduction . . . 1 2 Related Work . . . 3 2.1 Image Annotation Tool . . . 3 2.2 Semantic Segmentation Dataset . . . 4 3 System Design . . . 7 3.1 Layout . . . 7 3.2 UI Details . . . 8 3.2.1 NFOV Viewer . . . 8 3.2.2 Control Panel . . . 11 3.2.3 Data Panel . . . 14 3.2.4 Equirectangular Viewer . . . 15 3.3 Keyboard Shortcuts . . . 15 4 Implementation . . . 17 4.1 Label 360 . . . 17 4.1.1 Setup . . . 18 4.1.2 Backend . . . 19 4.1.3 Frontend . . . 23 4.2 Annotation Format . . . 25 4.3 Postprocessing . . . 28 4.3.1 Problems . . . 28 4.3.2 Algorithm . . . 30 4.3.3 Example . . . 32 5 Experiment . . . 33 5.1 Setup . . . 33 5.2 Metric . . . 36 5.2.1 Annotated Pixel Ratio in 2D and 3D . . . 36 5.2.2 Number of Vertex vs. Annotation Time . . . 36 5.2.3 Intersection over Union . . . 37 5.3 Result . . . 37 5.3.1 Annotator Consistency . . . 37 5.3.2 Annotation Tool Comparison . . . 39 6 Conclusion . . . 41 A Formula . . . 43 A.1 Notation and Names . . . 43 A.2 Projection . . . 44 A.2.1 NFOV → Sphere . . . 44 A.2.2 Sphere → NFOV . . . 44 A.3 Orthodrome Interpolation . . . 46 A.3.1 Spherical Law of Cosines . . . 46 A.3.2 Midpoint . . . 46 A.3.3 Intermediate Point . . . 47 A.4 Quaternion . . . 48 A.4.1 Euler Angle → Quaternion . . . 48 A.4.2 Quaternion → Euler Angle . . . 48 A.4.3 Quaternion Interpolation . . . 48 B Crowdsourcing . . . 49 B.1 Qualification API . . . 49 B.2 Environment Variables . . . 50 Bibliography . . . 54 | |
dc.language.iso | en | |
dc.title | 全景影像例項語意分割之標注工具 | zh_TW |
dc.title | Label360: An annotation interface for labelling instance-aware semantic labels on panoramic full images | en |
dc.type | Thesis | |
dc.date.schoolyear | 108-2 | |
dc.description.degree | 碩士 | |
dc.contributor.oralexamcommittee | 李明穗(Ming-Sui Lee),葉正聖(Jeng-Sheng Yeh) | |
dc.subject.keyword | 影像標注工具,影像語意分割,網頁應用程式,球面影像,全景影像 | zh_TW |
dc.subject.keyword | image annotation tool, image semantic segmentation, web-based application, spherical images, panorama | en |
dc.relation.page | 54 | |
dc.identifier.doi | 10.6342/NTU202001310 | |
dc.rights.note | 有償授權 | |
dc.date.accepted | 2020-07-15 | |
dc.contributor.author-college | 電機資訊學院 | zh_TW |
dc.contributor.author-dept | 資訊網路與多媒體研究所 | zh_TW |
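The post-processing described in the abstract stores each polygon vertex as a (longitude, latitude) pair, joins consecutive vertices along great-circle arcs, and rasterises the result onto the equirectangular panorama; annotator consistency is then reported as an IoU over the resulting label masks. The code below is only a minimal sketch of those two ideas, not the thesis implementation: the function names, the 4096x2048 panorama size, the equirectangular pixel convention, and the list-of-lists mask format are assumptions made for illustration, and seam-crossing edges and polygon filling are omitted.

```python
# Illustrative sketch (assumed names and conventions, not the Label 360 code):
# densify one polygon edge along its great circle, map it to equirectangular
# pixels, and compute IoU between two label masks.
import math

def great_circle_points(p1, p2, steps=32):
    """Interpolate along the great circle between two (lon, lat) vertices in radians."""
    lon1, lat1 = p1
    lon2, lat2 = p2
    # Angular distance from the spherical law of cosines (clamped for safety).
    d = math.acos(max(-1.0, min(1.0,
        math.sin(lat1) * math.sin(lat2) +
        math.cos(lat1) * math.cos(lat2) * math.cos(lon2 - lon1))))
    if d < 1e-9:                      # coincident vertices: nothing to interpolate
        return [p1]
    pts = []
    for i in range(steps + 1):
        f = i / steps
        a = math.sin((1 - f) * d) / math.sin(d)
        b = math.sin(f * d) / math.sin(d)
        # Blend the two vertices as unit vectors, then convert back to lon/lat.
        x = a * math.cos(lat1) * math.cos(lon1) + b * math.cos(lat2) * math.cos(lon2)
        y = a * math.cos(lat1) * math.sin(lon1) + b * math.cos(lat2) * math.sin(lon2)
        z = a * math.sin(lat1) + b * math.sin(lat2)
        pts.append((math.atan2(y, x), math.atan2(z, math.hypot(x, y))))
    return pts

def to_equirect_pixel(lon, lat, width, height):
    """Map (lon, lat) in radians to (column, row) of an equirectangular image.
    Assumes lon in [-pi, pi] and lat in [-pi/2, pi/2]; seam crossing is not handled."""
    col = (lon + math.pi) / (2 * math.pi) * (width - 1)
    row = (math.pi / 2 - lat) / math.pi * (height - 1)
    return col, row

def mask_iou(mask_a, mask_b):
    """Intersection over Union of two same-sized boolean masks (lists of lists)."""
    inter = union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    return inter / union if union else 0.0

if __name__ == "__main__":
    # Trace one densified edge on an assumed 4096x2048 panorama.
    edge = great_circle_points((math.radians(10), math.radians(20)),
                               (math.radians(80), math.radians(-5)), steps=8)
    for lon, lat in edge:
        print(to_equirect_pixel(lon, lat, 4096, 2048))
    # Toy IoU check: two 2x2 masks overlapping in one of three labelled pixels.
    print(mask_iou([[True, True], [False, False]],
                   [[True, False], [True, False]]))   # -> 1/3
```

Applying the same interpolation to every edge of a closed polygon and filling its interior (for example with a scan-line fill) would yield the per-pixel class mask over which a consistency IoU such as the 0.92 reported in the abstract can be computed.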
Appears in Collections: | 資訊網路與多媒體研究所
Files in this item:
File | Size | Format |
---|---|---|---|
U0001-0407202015122000.pdf (currently not authorized for public access) | 3.41 MB | Adobe PDF |
All items in this repository are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.