Machine Learning for Content-Aware SVG Comic Compression and its Novel Applications

Chung-Yuan Su; 蘇忠原

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/68315

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	張瑞益
dc.contributor.author	Chung-Yuan Su	en
dc.contributor.author	蘇忠原	zh_TW
dc.date.accessioned	2021-06-17T02:17:25Z	-
dc.date.available	2022-08-25
dc.date.copyright	2017-08-25
dc.date.issued	2017
dc.date.submitted	2017-08-16
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/68315	-
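dc.description.abstract	SVG (Scalable Vector Graphics) has become the international standard format for describing 2D graphics in HTML5, and it is also the international standard format for comic content in EPUB e-books. Although many raster-to-SVG conversion systems have been proposed, the converted files are relatively large and the visual quality of the image is often degraded. We therefore previously proposed new image processing techniques that reduce the file size after raster-to-SVG conversion, achieving a better compression ratio than earlier methods. However, those techniques do not process or optimize the special content found in comics, such as text, gradients, and halftone screentones. To further reduce the size of SVG comic files, this dissertation proposes processing methods for these kinds of content. After converting a raster image into an SVG file, we detect and recognize text, color gradients, and textures and embed them in the SVG file. For text, we use SWT (Stroke Width Transform) and geometric rules to filter out non-text components, and combine HOG (Histogram of Oriented Gradients) with SVM (Support Vector Machine) to further reduce false positives. Next, OCR (Optical Character Recognition) is used to recognize the text regions. To avoid vectorizing the text regions, the OCR results are embedded in the SVG file together with their coordinates. For color gradients, we propose the CGV (Color Gradient Vectorization) method. A linear-time CGV algorithm first identifies the color and gradient direction in each region. We then merge adjacent regions that share the same color and gradient direction into a larger region and represent its SVG path with gradient syntax. For textures, our method uses the CSG (Composite Sub-band Gradient) vector as the texture descriptor and uses SVM to classify texture regions in the comic. It then incorporates ACM (Active Contour Model) to improve segmentation accuracy along region contours. Experimental results show that the proposed method not only outperforms other state-of-the-art SVG vectorization systems in both SVG comic file size and perceived visual quality, but also delivers better display performance on modern handheld devices. Finally, comics can be translated into other languages to provide multilingual services easily, text- or content-based image search can be performed efficiently, and the approach can also provide an innovative application system for digital storytellers.	zh_TW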
dc.description.abstract	SVG has become the standard format for 2D graphics in HTML5 and EPUB. Although several image-to-SVG conversion systems have been proposed, the files they produce are still large. We previously proposed a system that converts raster comic images into vector SVG files with a better compression ratio than earlier methods. However, these methods do not process the content of the comics, such as text elements, color gradients, and textures. In this dissertation, we convert raster comic images to SVG files and recognize text elements, color gradients, and textures, embedding them in the SVG files. For text, SWT (Stroke Width Transform) and geometric filtering are used to filter out non-text elements, and HOG (Histogram of Oriented Gradients) is combined with SVM (Support Vector Machine) to further reduce false positives. OCR is then applied to the remaining text areas. Instead of encoding the text regions as vectors, the text elements are embedded in the SVG file along with their coordinate values. For color gradients (CG), the proposed CGV (CG vectorization) first applies a linear-time algorithm to identify the CG vector, which represents the color and direction of the CG in each region. We then merge neighboring regions that have the same CG vector into one large CG region and represent it by a single SVG path with linear gradient syntax. For textures, our method uses the CSG (Composite Sub-band Gradient) vector as the texture descriptor and uses SVM to classify texture areas in the comic. ACM (Active Contour Model) combined with CSG vectors is then introduced to improve the segmentation accuracy on contour regions. Experimental results show that our method outperforms other state-of-the-art SVG vectorization systems in terms of both SVG size and perceptual quality. It also lets vectorized comics be rendered with higher performance on modern e-book devices. Using the recognized text elements, we can further translate comics into other languages to provide multilingual services easily. Text/content-based image search can be supported efficiently. It can also provide a novel application system for digital storytelling.	en
dc.description.provenance	Made available in DSpace on 2021-06-17T02:17:25Z (GMT). No. of bitstreams: 1 ntu-106-D98525007-1.pdf: 6174626 bytes, checksum: 4bfa2f469c219bfb7bb7c5be8cf05d94 (MD5) Previous issue date: 2017	en
dc.description.tableofcontents	Acknowledgements i Abstract (Chinese) ii ABSTRACT iii CONTENTS iv LIST OF FIGURES vii LIST OF TABLES xii List of Abbreviations xiii Chapter 1 Introduction 1 1.1 Research background 1 1.1.1 Comic 1 1.1.2 SVG 2 1.2 Motivation 3 1.3 Contributions 7 Chapter 2 Related Works 9 2.1 Vectorization 9 2.2 Text 10 2.3 Color gradient 11 2.4 Texture 12 Chapter 3 Proposed Methods 14 3.1 Noise-reducing filter 14 3.2 Basic vectorization 14 3.2.1 Autotrace 14 3.2.2 Color clustering and middle point detection 14 3.3 Text elements processing 16 3.3.1 SCW (Sliding Concentric Window) segmentation method 17 3.3.2 SWT (Stroke Width Transform)+HOG (Histogram of Oriented Gradient) segmentation method 20 3.3.3 Embedding characters in SVG file 24 3.4 Color gradient processing 25 3.4.1 CGV (Color Gradient Vectorization) 26 3.4.2 IVCS (Improved Vector Contour Searching) 28 3.4.3 Post-processing 30 3.4.4 Linear gradient approximation 32 3.5 Texture elements processing 33 3.5.1 Texture segmentation 34 3.5.2 Pattern approximation 41 Chapter 4 Experimental Results 43 4.1 Performance of the text detectors 45 4.1.1 Parameter settings 45 4.1.2 The metrics of text detectors 46 4.2 Performance of color gradient processing 48 4.2.1 Comparison 48 4.2.2 Quality measurement 51 4.2.3 Size measurement 54 4.2.4 Rendering time measurement 55 4.2.5 Extensions 57 4.3 Performance of texture elements 58 4.3.1 Parameters setting 58 4.3.2 Segmentation performance 58 4.3.3 Size measurement 60 4.3.4 Rendering time measurement 62 Chapter 5 Novel Applications 63 5.1 Multilingual services 63 5.2 Text/content-based image search 63 5.3 Digital storytelling 64 5.3.1 Text pre-processing 65 5.3.2 Lexical similarity 66 Chapter 6 Conclusions 69 REFERENCES 70
dc.language.iso	en
dc.subject	SVG	zh_TW
dc.subject	Digital storytelling	zh_TW
dc.subject	Color gradient vector	zh_TW
dc.subject	Texture recognition	zh_TW
dc.subject	Machine learning	zh_TW
dc.subject	Text recognition	zh_TW
dc.subject	Vector compression	zh_TW
dc.subject	Color gradient vector	en
dc.subject	Machine learning	en
dc.subject	Digital storytelling	en
dc.subject	SVG	en
dc.subject	Texture recognition	en
dc.subject	Text recognition	en
dc.subject	Vector compression	en
dc.title	Machine Learning for Content-Aware SVG Comic Compression and its Novel Applications	zh_TW
dc.title	Machine Learning for Content-Aware SVG Comic Compression and its Novel Applications	en
dc.type	Thesis
dc.date.schoolyear	106-1
dc.description.degree	Doctoral
dc.contributor.oralexamcommittee	丁肇隆,劉震昌,尹邦嚴,呂承諭,王家輝
dc.subject.keyword	SVG,Vector compression,Text recognition,Machine learning,Texture recognition,Color gradient vector,Digital storytelling	zh_TW
dc.subject.keyword	SVG,Vector compression,Text recognition,Machine learning,Texture recognition,Color gradient vector,Digital storytelling	en
dc.relation.page	75
dc.identifier.doi	10.6342/NTU201703615
dc.rights.note	Paid authorization
dc.date.accepted	2017-08-17
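dc.contributor.author-college	College of Engineering	zh_TW
dc.contributor.author-dept	Graduate Institute of Engineering Science and Ocean Engineering	zh_TW
Appears in Collections:	Department of Engineering Science and Ocean Engineering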
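
The abstract describes two SVG encodings: OCR results stored as characters with their coordinates instead of vectorized outlines, and a merged color-gradient region stored as a single path with linear gradient syntax. The following is a minimal Python sketch of what such output could look like, not the thesis implementation. The function name `build_comic_svg`, the dictionary layout, the fixed left-to-right gradient direction, and the sample values are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # serialize with a default xmlns, no prefix


def _tag(name):
    """Qualify an element name with the SVG namespace."""
    return f"{{{SVG_NS}}}{name}"


def build_comic_svg(width, height, gradient_region, text_items):
    """Hypothetical helper: one gradient-filled path plus embedded text."""
    svg = ET.Element(_tag("svg"), {
        "width": str(width),
        "height": str(height),
        "viewBox": f"0 0 {width} {height}",
    })

    # One gradient definition stands in for a merged color-gradient region.
    # The horizontal direction is an assumption; a real CG vector would set it.
    defs = ET.SubElement(svg, _tag("defs"))
    grad = ET.SubElement(defs, _tag("linearGradient"), {
        "id": gradient_region["id"],
        "x1": "0%", "y1": "0%", "x2": "100%", "y2": "0%",
    })
    for offset, color in gradient_region["stops"]:
        ET.SubElement(grad, _tag("stop"),
                      {"offset": offset, "stop-color": color})

    # A single path filled by reference to the gradient definition.
    ET.SubElement(svg, _tag("path"), {
        "d": gradient_region["path"],
        "fill": f"url(#{gradient_region['id']})",
    })

    # Recognized text is kept as characters plus coordinates, not outlines.
    for item in text_items:
        node = ET.SubElement(svg, _tag("text"), {
            "x": str(item["x"]),
            "y": str(item["y"]),
            "font-size": str(item["size"]),
        })
        node.text = item["text"]  # ElementTree escapes &, <, > on output

    return ET.tostring(svg, encoding="unicode")


example_svg = build_comic_svg(
    200, 100,
    {"id": "cg0",
     "stops": [("0%", "#ffffff"), ("100%", "#404040")],
     "path": "M0 0 H200 V100 H0 Z"},
    [{"x": 20, "y": 40, "size": 14, "text": "Hello & welcome!"}],
)
print(example_svg)
```

Because the text stays in the file as characters, it can be replaced with a translation or indexed for search without touching the artwork, which is the basis for the multilingual and search applications the abstract mentions.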

Files in This Item:

File	Size	Format
ntu-106-1.pdf (not authorized for public access)	6.03 MB	Adobe PDF


Items in the system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.