Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/91571
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 陳銘憲 | zh_TW |
dc.contributor.advisor | Ming-Syan Chen | en |
dc.contributor.author | 王韋翰 | zh_TW |
dc.contributor.author | Wei-Han Wang | en |
dc.date.accessioned | 2024-01-28T16:34:55Z | - |
dc.date.available | 2024-01-29 | - |
dc.date.copyright | 2024-01-28 | - |
dc.date.issued | 2023 | - |
dc.date.submitted | 2023-08-08 | - |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/91571 | - |
dc.description.abstract | 現今的深偽影片檢測器在不平衡的測試環境中表現優異,因為其目標是區分生成的影片與真實影片。這些先前的方法都重度依賴深偽演算法中生成的噪音,或者是所謂的生成瑕疵。然而,隨著生成模型的快速進步,我們預期深度偽造技術將進化到接近完美的境界,也就是說不存在可辨識的生成瑕疵。為了減少對生成瑕疵的依賴,我們設計了一種名為「Rebalanced Deepfake Detection Protocol (RDDP)」的方法來平衡偽造和真實樣本之間的生成噪音,並進一步提出兩種變體。第一種 RDDP-WHITEHAT 使用白帽演算法,利用同一人的圖像來生成深偽影片。這些自我偽造的影片,儘管包含深度偽造的瑕疵,但由於其自我相似性,被視為真實的樣本。第二種 RDDP-SURROGATE 利用不同的函數來後處理偽造和真實的樣本,賦予相同類型的噪音。這種方法在不依賴深度偽造演算法的情況下,使測試環境更加公平。
為了檢測沒有生成瑕疵的深度偽造,我們引入了一種名為 ID-Miner 的檢測器,其設計目的是識別深偽影片人物背後的操控者。我們的模型強調臉部動作並忽略生成瑕疵或人物外觀,是一種利用身份驗證的深偽影片檢測器:它透過比較給定影片與參考影片的特徵表示來驗證其真實性。基於影格層級的「artifact-agnostic loss」(瑕疵無關損失)和影片層級的「identity-anchored loss」(身份錨定損失),ID-Miner 有效地在生成瑕疵和不同的外觀變化中分離出具代表性的身份訊息。在三種測試協議和兩個深度偽造資料集下與十二個基線檢測器的比較實驗,以及額外的質化研究,均證明了我們方法的優勢,以及以預見完美深偽為前提設計檢測器的必要性。 | zh_TW |
dc.description.abstract | State-of-the-art deepfake detectors excel primarily in unbalanced environments because the goal is to distinguish between generated videos and real ones. Consequently, all prior methods, intentionally or inadvertently, rely heavily on generative noise, or artifacts. However, as deep generative models rapidly improve, we anticipate the evolution of deepfakes toward a state of “perfection” in which no discernible artifacts exist. To reduce reliance on artifacts, we design the Rebalanced Deepfake Detection Protocol (RDDP), which balances the existence of generative noise between forged and real examples, with two variants based on the availability of a “white-hat” deepfake algorithm. Specifically, RDDP-WHITEHAT employs a white-hat algorithm to reconstruct genuine portrait videos using images of the same subject. These self-deepfakes, while containing deepfake artifacts, are considered genuine due to their self-likeness. RDDP-SURROGATE, on the other hand, exploits surrogate functions to post-process both forged and genuine examples, imbuing both with the same type of noise. This variant levels the playing field without resorting to deepfake algorithms.
Toward detecting deepfakes without artifacts, we introduce ID-Miner, a detector designed to discern the puppeteer behind the disguise. By emphasizing motion and disregarding artifacts or appearances, our model functions as an identity-based detector: it verifies the authenticity of a given video by comparing its representation with that of a reference video. Anchored by a frame-level artifact-agnostic loss and a video-level identity-anchored loss, ID-Miner effectively isolates characteristic identity information amid artifacts and appearance variations. Comparative experiments against twelve baseline detectors under three evaluation protocols and two deepfake datasets, together with additional qualitative studies, demonstrate the advantages of our method and the necessity of detector designs that anticipate perfect deepfakes. | en |
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-01-28T16:34:55Z No. of bitstreams: 0 | en |
dc.description.provenance | Made available in DSpace on 2024-01-28T16:34:55Z (GMT). No. of bitstreams: 0 | en |
dc.description.tableofcontents | 誌謝 i
摘要 ii
Abstract iv
Contents vi
List of Figures viii
List of Tables xi
1 Introduction 1
2 Related Work 5
3 Approach 7
3.1 Problem Formulation 7
3.2 ID-Miner 11
4 Experiment 15
4.1 Dataset 15
4.2 Baseline 15
4.3 Conventional and Rebalanced Deepfake Detection Protocol Evaluations 18
4.4 Qualitative Assessment 22
4.5 Ablation Study 24
4.6 Training Detail 26
4.7 Testing Detail 27
4.8 Cost Analysis 29
4.9 Data Sample Visualizations 30
5 Discussion and Conclusion 35
References 39 | - |
dc.language.iso | en | - |
dc.title | 預見完美深度偽造: 基於身份對瑕疵無關的偵測 | zh_TW |
dc.title | In Anticipation of Perfect Deepfake: Identity-anchored Artifact-agnostic Detection | en |
dc.type | Thesis | - |
dc.date.schoolyear | 111-2 | - |
dc.description.degree | Master | - |
dc.contributor.oralexamcommittee | 陳祝嵩;沈之涯;彭文志 | zh_TW |
dc.contributor.oralexamcommittee | Chu-Song Chen;Chih-Ya Shen;Wen-Chih Peng | en |
dc.subject.keyword | 深度偽造,深度偽造偵測 | zh_TW |
dc.subject.keyword | Deepfake, Deepfake detection | en |
dc.relation.page | 48 | - |
dc.identifier.doi | 10.6342/NTU202301341 | - |
dc.rights.note | Not authorized | - |
dc.date.accepted | 2023-08-09 | - |
dc.contributor.author-college | 電機資訊學院 | - |
dc.contributor.author-dept | 電機工程學系 | - |
Appears in Collections: | 電機工程學系
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-111-2.pdf (access currently restricted) | 5.22 MB | Adobe PDF |
Items in this system are protected by copyright, with all rights reserved, unless otherwise indicated.