Robustness to Byzantine attacks is a necessity in many distributed training scenarios. When training reduces to solving a minimization problem, Byzantine robustness is relatively well understood. However, other problem formulations, such as min-max problems or, more generally, variational inequalities, arise in many modern machine learning tasks and, in particular, in distributed learning. These problems differ significantly from standard minimization and therefore require separate consideration. Nevertheless, only one work (Adibi et al., 2022) addresses this important question in the context of Byzantine robustness. Our work takes a further step in this direction: we propose several provably Byzantine-robust methods for distributed variational inequalities, thoroughly study their theoretical convergence, remove the limitations of the previous work, and provide numerical comparisons supporting the theoretical findings.