Vision Based User Interface Segmentation Algorithm (基於視覺之使用者界面分割演算法)

Yi-An Chen; 陳奕安

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/1161

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	王勝德(Sheng-De Wang)
dc.contributor.author	Yi-An Chen	en
dc.contributor.author	陳奕安	zh_TW
dc.date.accessioned	2021-05-12T09:33:32Z	-
dc.date.available	2019-08-01
dc.date.available	2021-05-12T09:33:32Z	-
dc.date.copyright	2018-08-01
dc.date.issued	2018
dc.date.submitted	2018-07-27
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/handle/123456789/1161	-
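dc.description.abstract	Image segmentation is widely used to separate information that is presented in visually different ways. In user interface research, however, different usage environments have developed different algorithms. In this thesis, we propose a unified, vision-based user interface segmentation algorithm that computes the structural information of a user interface from its screenshot alone. The algorithm first uses edge detection and a set of predefined rules to detect the basic elements of the user interface: boxes, line segments, and shape contours. We then define a function that computes the distance between two elements, together with a threshold selection algorithm, to perform hierarchical clustering. We run the algorithm on web interfaces and mobile applications to evaluate its performance, and analyze the failure cases that commonly arise during the evaluation.	zh_TW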
dc.description.abstract	Segmentation is widely used to separate information that is presented in visually distinct ways. However, work on different user interface environments has tended to develop its own algorithms for this process. In this thesis, we propose a unified vision-based segmentation algorithm, called UISeg, that uses only screenshots to estimate the structural information of a user interface. The algorithm first leverages edge detection and a set of heuristics to recognize discrete elements such as boxes, lines, and contours. We then define a pairwise distance function and a threshold selection algorithm for the hierarchical clustering process. We evaluate the performance of UISeg on screenshots of web pages and mobile applications, and analyze the failure cases common to both.	en
dc.description.provenance	Made available in DSpace on 2021-05-12T09:33:32Z (GMT). No. of bitstreams: 1 ntu-107-R05921035-1.pdf: 5635462 bytes, checksum: 5ab0eccc8a3c38b42fe3c6cd38f1229c (MD5) Previous issue date: 2018	en
dc.description.tableofcontents	Acknowledgements; Abstract (Chinese); Abstract; 1 Introduction; 2 Related Work; 2.1 Web Applications; 2.2 Mobile Applications; 2.3 Desktop Applications; 3 Methodology; 3.1 Text Detection; 3.2 Contour Detection with Computer Vision Techniques; 3.3 Distance Model; 3.4 Hierarchical Clustering; 4 Evaluation; 4.1 Performance Evaluation; 4.2 Failure Analysis; 4.2.1 Floating elements; 4.2.2 Wrapping rows; 4.2.3 Invisible and highlight separators; 4.3 Conclusion; Bibliography
dc.language.iso	en
dc.title	基於視覺之使用者界面分割演算法	zh_TW
dc.title	Vision Based User Interface Segmentation Algorithm	en
dc.type	Thesis
dc.date.schoolyear	106-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	雷欽隆(Chin-Laung Lei),王鈺強(Yu-Chiang Frank Wang),曾俊元(Chinyang Henry Tseng)
dc.subject.keyword	image segmentation, user interface, human-computer interface, document analysis, cross-platform integration	zh_TW
dc.subject.keyword	segmentation, user interface, human–computer interaction, document analysis, cross-platform integration	en
dc.relation.page	30
dc.identifier.doi	10.6342/NTU201802030
dc.rights.note	Authorization granted (open access worldwide)
dc.date.accepted	2018-07-27
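dc.contributor.author-college	College of Electrical Engineering and Computer Science (電機資訊學院)	zh_TW
dc.contributor.author-dept	Graduate Institute of Electrical Engineering (電機工程學研究所)	zh_TW
Appears in Collections:	Department of Electrical Engineering (電機工程學系)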
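
The abstract describes grouping detected elements with a pairwise distance function and a threshold. The following is only a minimal illustrative sketch of that general idea, not the UISeg implementation from the thesis. Everything in it is an assumption made for illustration: the `Box` format, the names `box_gap` and `cluster_boxes`, the use of the edge-to-edge gap as the distance, and a fixed `threshold` parameter in place of the thesis's threshold selection algorithm. It relies on the fact that cutting a single-linkage hierarchy at a given distance yields the connected components of the graph that links every pair of elements no farther apart than that distance.

```python
from itertools import combinations
from typing import Dict, List, Tuple

# Hypothetical element format: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


def box_gap(a: Box, b: Box) -> float:
    """Edge-to-edge Euclidean gap between two axis-aligned boxes.

    Returns 0.0 when the boxes touch or overlap. This distance is an
    assumption for illustration, not the thesis's distance model.
    """
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    dx = max(bx - (ax + aw), ax - (bx + bw), 0)
    dy = max(by - (ay + ah), ay - (by + bh), 0)
    return (dx * dx + dy * dy) ** 0.5


def cluster_boxes(boxes: List[Box], threshold: float) -> List[List[int]]:
    """Single-linkage clustering cut at `threshold`, using union-find.

    Returns groups of indices into `boxes`, sorted by smallest index.
    """
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of boxes whose gap does not exceed the threshold.
    for i, j in combinations(range(len(boxes)), 2):
        if box_gap(boxes[i], boxes[j]) <= threshold:
            parent[find(i)] = find(j)

    groups: Dict[int, List[int]] = {}
    for i in range(len(boxes)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    # Two pairs of nearby boxes separated by a wide horizontal gap.
    demo = [(0, 0, 10, 10), (12, 0, 10, 10), (100, 0, 10, 10), (112, 0, 10, 10)]
    print(cluster_boxes(demo, threshold=5))
    print(cluster_boxes(demo, threshold=100))
```

Running `cluster_boxes` with several increasing thresholds gives progressively coarser groupings, which is one simple way to read off levels of a hierarchy. How the thresholds should be chosen is exactly what a threshold selection step would decide, and this sketch leaves that open.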

Files in this item:

File	Size	Format
ntu-107-1.pdf	5.5 MB	Adobe PDF

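Unless their copyright terms are specifically stated otherwise, all items in this system are protected by copyright, with all rights reserved.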