Byzantine-robust learning has emerged as a prominent fault-tolerant distributed machine learning framework. However, most techniques consider the static setting, wherein the identity of Byzantine machines remains fixed throughout the learning process. This assumption fails to capture real-world dynamic Byzantine behaviors, which may include transient malfunctions or targeted temporal attacks. Addressing this limitation, we propose $\textsf{DynaBRO}$ -- a new method capable of withstanding $\mathcal{O}(\sqrt{T})$ rounds of Byzantine identity alterations (where $T$ is the total number of training rounds), while matching the asymptotic convergence rate of the static setting. Our method combines a multi-level Monte Carlo (MLMC) gradient estimation technique with robust aggregation of worker updates and incorporates a fail-safe filter to limit bias from dynamic Byzantine strategies. Additionally, by leveraging an adaptive learning rate, our approach eliminates the need to know the fraction of Byzantine workers.
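To make the MLMC component concrete, below is a minimal sketch of a generic multi-level Monte Carlo gradient estimator of the kind the abstract refers to. The names `sample_grads` and `j_max` are illustrative, and this is a standard MLMC construction under common assumptions, not the paper's exact procedure: a random level $J$ is drawn with $\Pr[J=j] = 2^{-j}$, and a cheap baseline gradient is corrected by a coupled two-level difference scaled by $2^J$, yielding an estimator whose mean matches that of an average over $2^{J_{\max}}$ samples.

```python
import numpy as np

def mlmc_gradient(sample_grads, j_max=10, rng=None):
    """Sketch of a multi-level Monte Carlo (MLMC) gradient estimator.

    sample_grads(n) is assumed to return an (n, d) array of n i.i.d.
    stochastic gradients; `sample_grads` and `j_max` are hypothetical
    names introduced for this illustration.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Draw a level J >= 1 with P(J = j) = 2^{-j}, truncated at j_max.
    j = int(min(rng.geometric(0.5), j_max))
    # Cheap baseline: a single stochastic gradient.
    g0 = sample_grads(1).mean(axis=0)
    # Coupled two-level difference: g_j averages 2^j samples,
    # g_{j-1} averages the first half of the same samples.
    grads = sample_grads(2 ** j)
    g_j = grads.mean(axis=0)
    g_jm1 = grads[: 2 ** (j - 1)].mean(axis=0)
    # Inverse-probability weighting by 2^j keeps the estimator unbiased
    # for the mean of an average over 2^{j_max} samples.
    return g0 + (2 ** j) * (g_j - g_jm1)
```

With a zero-variance oracle (every sample equals the true gradient), the correction term vanishes and the estimator returns the true gradient exactly, which is a quick sanity check of the construction.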