AutoSketch: VLM-Assisted Style-Aware Vector Sketch Completion

秦孝媛; Hsiao Yuan Chin

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98493

Title:	AutoSketch: VLM-Assisted Style-Aware Vector Sketch Completion
Author:	秦孝媛 Hsiao Yuan Chin
Advisor:	陳炳宇 Bing-Yu Chen
Keywords:	Vector Sketches, Sketch Completion, Style-Aware, Scene Completion, Bézier Curves
Publication Year:	2025
Degree:	Master's
Abstract:	Sketches are an important medium of expression, and many recent works concentrate on automatic sketch creation. One capability that is especially useful for amateurs is text-based completion of a partial sketch to create a complex scene while preserving the style of the partial sketch. Existing methods focus solely on generating sketches that match the content of the input prompt in a predefined style, ignoring the style of the input partial sketch, e.g., its global abstraction level and local stroke styles. To address this challenge, we introduce AutoSketch, a style-aware vector sketch completion method that accommodates diverse sketch styles and supports iterative sketch completion. AutoSketch completes the input sketch in a style-consistent manner using a two-stage method. In the first stage, we optimize the strokes to match an input prompt augmented with style descriptions extracted by a vision-language model (VLM). These style descriptions lead to non-photorealistic guidance images, which enable more content to be depicted through new strokes. In the second stage, we use the VLM to adjust the strokes from the first stage so that they adhere to the style of the input partial sketch, through an iterative style adjustment process. In each iteration, the VLM identifies a list of style differences between the input sketch and the strokes generated in the previous stage and translates these differences into adjustment codes that modify the strokes. We compare our method with existing methods across various sketch styles and prompts, perform extensive ablation studies along with qualitative and quantitative evaluations, and demonstrate that AutoSketch can support diverse sketching scenarios.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98493
DOI:	10.6342/NTU202502832
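Full-Text Authorization:	Not authorized
Electronic Full-Text Release Date:	N/A
Appears in Collections:	Department of Information Management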

Files in This Item:

File	Size	Format
ntu-113-2.pdf (not authorized for public access)	31.6 MB	Adobe PDF


Unless their copyright terms are specifically indicated, all items in this system are protected by copyright, with all rights reserved.
