Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/87528

| Title: | 穩健聯邦式機器學習於惡意攻擊者和消極參與者之情境 Robust Federated Learning in the Presence of Byzantine and Straggling Clients |
| Author: | 陳智泓 Chih-Hung Chen |
| Advisor: | 王奕翔 I-Hsiang Wang |
| Keywords: | Federated learning, Distributed learning, Byzantine-robustness, Straggler-robustness, Fault-tolerance, Stochastic optimization |
| Publication Year: | 2023 |
| Degree: | Master's |
| Abstract: | In Byzantine-robust federated learning, a parameter server coordinates multiple clients (also known as worker machines) to train a learning model. However, a fraction of these clients may be compromised (Byzantine clients), sending incorrect responses to the server with the aim of preventing convergence of the learning process. In this work, we consider a more practical but more challenging setting in which the non-Byzantine clients may be straggling clients, or stragglers, which always respond correctly but, due to factors such as computation speed and network latency, fail to respond to a proportion of the requests from the server.
Although robustness in the presence of both kinds of faulty clients (Byzantine clients and stragglers) is a crucial issue in real-world federated learning systems, we argue that none of the existing robust algorithms are satisfactory, for two reasons. First, the existing algorithms that tolerate both kinds of faulty clients cannot guarantee a vanishing error rate in theory. Second, the existing algorithms that tolerate only Byzantine clients exhibit divergent error under some mild straggler-presence settings in our empirical evaluation. The underlying reason for the failure of existing Byzantine-robust algorithms is that the presence of stragglers introduces extra variance among the updates of the non-Byzantine clients, which leaves more room for Byzantine attacks. In particular, we design a new, simple Byzantine attack that exploits the presence of stragglers and circumvents existing Byzantine-robust algorithms. To achieve robustness against both kinds of faulty clients, we propose FILLUP, a filler-based scheme that fills in the missing updates from stragglers to mitigate the extra variance they incur. Moreover, we propose REUPLOAD, a client-scheduling scheme that arranges the clients' retransmission counts, tailored to stragglers whose straggling pattern is independent and identically distributed (i.i.d.) across the requests from the server. Our schemes adapt existing robust algorithms to the straggler-presence setting both in theory and in practice. Under the variance-bounded noise assumption, for non-convex objective functions, our algorithm guarantees a vanishing error rate against both adversarial and i.i.d. stragglers. Under the deviation-bounded noise assumption, for generally-convex and strongly-convex objective functions, our algorithm guarantees a vanishing error rate against i.i.d. stragglers that matches the order of the lower bound. Experimental results show that our schemes are effective against various Byzantine attacks on real-world datasets. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/87528 |
| DOI: | 10.6342/NTU202300335 |
| Full-Text Authorization: | Authorized (campus-only access) |
| Electronic Full-Text Release Date: | 2028-02-10 |
| Appears in Collections: | Graduate Institute of Communication Engineering |
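The FILLUP scheme described in the abstract substitutes filler vectors for the updates that stragglers fail to return, before a Byzantine-robust aggregation step. As a rough illustrative sketch only (the function name, the choice of filler, and the use of coordinate-wise median as the robust aggregator are assumptions for illustration, not the thesis's actual construction), one round might look like:

```python
import numpy as np

def fillup_aggregate(updates, filler):
    """Illustrative FILLUP-style round: each straggler's missing update is
    replaced by a filler vector before robust aggregation.
    `None` marks a client that did not respond in time (a straggler)."""
    filled = [u if u is not None else filler for u in updates]
    # Coordinate-wise median: a standard Byzantine-robust aggregator.
    return np.median(np.stack(filled), axis=0)

# Toy round: client 1 straggles; client 3 is Byzantine and sends a huge update.
updates = [np.array([1.0, 1.0]), None,
           np.array([2.0, 2.0]), np.array([100.0, 100.0])]
filler = np.array([1.5, 1.5])  # hypothetical filler, e.g. from earlier rounds
agg = fillup_aggregate(updates, filler)
# → array([1.75, 1.75]): the Byzantine update is rejected by the median,
#   and the filler keeps the straggler from shrinking the honest majority.
```

Without the filler, the aggregator would see only three updates, one of which is Byzantine, illustrating how stragglers leave more room for attacks.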
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-111-1.pdf Restricted Access | 1.02 MB | Adobe PDF | View/Open |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
