Neural Style Transfer Algorithm Analysis and Portrait Style Application

Deng-Jyun Wu; 吳登鈞

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/73364

Title:	Neural style transfer algorithm analysis and portrait style application (神經網絡風格轉換演算法分析及人像風格應用)
Author:	Deng-Jyun Wu (吳登鈞)
Advisor:	Soo-Chang Pei (貝蘇章)
Keywords:	Style transfer, image features, portrait style transfer
Year of publication:	2019
Degree:	Master's
Abstract:	This thesis compares style transfer algorithms. Feature extraction with a convolutional neural network captures the style and structure information of an image, which can then be used for reconstruction. VGG-NET extracts a large number of image features, and manipulating those features accomplishes style transfer; different parameter settings yield different stylistic effects. The main problem we found, however, is distortion in portrait details, which greatly limits practical application. By processing different regions of the image step by step, we address the distortion that style transfer tends to introduce in portraits. The first part of the thesis discusses the creative idea proposed by Gatys in 2015: a style transfer algorithm based on a convolutional network that reconstructs images from the features of different convolutional layers in VGG-NET together with a Gram-matrix design. The approach gives good results and reveals how the image features observed by the network differ from layer to layer; features from different layers produce different transfer effects.
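The Gram-matrix idea mentioned above can be illustrated with a minimal sketch. It assumes the feature map is already available as a NumPy array of shape (channels, height, width); in the thesis setting such features would come from VGG-NET layers, which are not reproduced here. The names `gram_matrix` and `style_loss` and the normalization constant are illustrative assumptions, not taken from the thesis.

```python
import numpy as np


def gram_matrix(features: np.ndarray) -> np.ndarray:
    """Return the C x C Gram matrix of a (C, H, W) feature map."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)  # one row per channel
    # Dividing by C*H*W is one common normalization (an assumption here).
    return flat @ flat.T / (c * h * w)


def style_loss(generated: np.ndarray, style: np.ndarray) -> float:
    """Sum of squared differences between the two Gram matrices."""
    diff = gram_matrix(generated) - gram_matrix(style)
    return float(np.sum(diff ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Random arrays stand in for real convolutional features.
    style_feat = rng.standard_normal((8, 16, 16))
    generated_feat = rng.standard_normal((8, 16, 16))
    print(gram_matrix(style_feat).shape)
    print(style_loss(generated_feat, style_feat))
```

Because the Gram matrix sums over all spatial positions, it is unchanged when those positions are shuffled. This is why it describes texture-like style statistics rather than the spatial structure of the image.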
This also shows that convolutional networks provide rich features and opens up new possibilities for style transfer. However, computational complexity limits its application, so many algorithms have been proposed to improve execution speed. The second part of the thesis covers encoder-decoder design, whitening and coloring, and feature mapping, again reconstructing from the features of different convolutional layers in VGG-NET to obtain different transfer effects; these methods improve efficiency while making arbitrary style transfer practical. The third part addresses the tendency of portrait style transfer to distort: we segment the image so that the neural network can capture the features of the key regions more accurately, making style transfer easier to use and operate on everyday devices. Our proposed algorithm therefore builds on the existing universal style transfer via feature transforms: it preprocesses the image, segments it, completes style transfer on each segment individually, and recombines the results, so that style transfer can also be applied to portrait photos.
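The whitening and coloring step named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions: features are plain NumPy arrays flattened to shape (channels, positions), and the encoder and decoder networks are omitted. The function names `wct` and `_cov_power`, the blending weight `alpha`, and the regularizer `eps` are hypothetical choices for this sketch, not the thesis implementation.

```python
import numpy as np


def _center(feat: np.ndarray):
    """Subtract the per-channel mean; return centered features and the mean."""
    mean = feat.mean(axis=1, keepdims=True)
    return feat - mean, mean


def _cov_power(centered: np.ndarray, power: float, eps: float = 1e-5) -> np.ndarray:
    """Raise the channel covariance matrix to `power` via eigendecomposition."""
    channels, n = centered.shape
    # eps * I keeps every eigenvalue strictly positive (an assumed regularizer).
    cov = centered @ centered.T / (n - 1) + eps * np.eye(channels)
    eigvals, eigvecs = np.linalg.eigh(cov)
    return (eigvecs * eigvals ** power) @ eigvecs.T


def wct(content: np.ndarray, style: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Whiten content features, then color them with the style statistics.

    content: (C, Nc) and style: (C, Ns) flattened feature maps.
    """
    fc, _ = _center(content)
    fs, style_mean = _center(style)
    whitened = _cov_power(fc, -0.5) @ fc  # covariance becomes ~identity
    colored = _cov_power(fs, 0.5) @ whitened + style_mean
    # alpha blends the stylized features with the original content features.
    return alpha * colored + (1.0 - alpha) * content


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    content_feat = rng.standard_normal((4, 500))
    style_feat = rng.standard_normal((4, 4)) @ rng.standard_normal((4, 300)) + 2.0
    out = wct(content_feat, style_feat)
    print(out.shape)
```

With `alpha=1.0` the output keeps the content feature layout but takes on the channel mean and covariance of the style features; smaller values of `alpha` move the result back toward the content features.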
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/73364
DOI:	10.6342/NTU201900805
Full-text license:	Paid license
Appears in department:	Graduate Institute of Communication Engineering

Files in this item:

File	Size	Format
ntu-108-1.pdf (not currently authorized for public access)	34.93 MB	Adobe PDF

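Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.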
