Aggregation rules are the cornerstone of distributed (or federated) learning in the presence of adversaries, under the so-called Byzantine threat model. They are also interesting mathematical objects from the point of view of robust mean estimation. The Krum aggregation rule has been extensively studied and endowed with formal robustness and convergence guarantees. Yet MultiKrum, a natural extension of Krum, is often preferred in practice for its superior empirical performance, even though no theoretical guarantees were available until now. In this work, we provide the first proof that MultiKrum is a robust aggregation rule, and we bound its robustness coefficient. To do so, we introduce $\kappa^\star$, the optimal *robustness coefficient* of an aggregation rule, which quantifies the accuracy of mean estimation in the presence of adversaries more tightly than previously adopted notions of robustness. We then establish an upper and a lower bound on MultiKrum's robustness coefficient. As a by-product, we also improve on the best-known bounds on Krum's robustness coefficient. We show that MultiKrum's bounds are never worse than Krum's, and are better in realistic regimes. We illustrate this analysis with an experimental investigation of the quality of the lower bound.
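For context on the objects named above, a common formalization of robustness in this line of work (assumed here for illustration; the paper's exact definition of $\kappa^\star$ may differ in detail) calls an aggregation rule $F$ $(f, \kappa)$-robust if, for any $n$ input vectors of which a subset $S$ of size $n - f$ is honest,
$$\left\| F(x_1, \dots, x_n) - \bar{x}_S \right\|^2 \le \frac{\kappa}{|S|} \sum_{i \in S} \left\| x_i - \bar{x}_S \right\|^2, \qquad \bar{x}_S = \frac{1}{|S|} \sum_{i \in S} x_i,$$
and $\kappa^\star(F)$ denotes the smallest coefficient $\kappa$ for which this holds. Likewise, since Krum and MultiKrum are referenced only by name, the following is a minimal NumPy sketch of both rules under the standard construction of Blanchard et al. (2017); the function names and the parameters `f` (number of tolerated Byzantine inputs) and `m` (selection size) are illustrative, not the paper's notation.

```python
import numpy as np

def krum_scores(vectors, f):
    """Score each vector by the summed squared distances to its
    n - f - 2 nearest neighbours (lower is better).
    The original analysis assumes n >= 2f + 3."""
    n = len(vectors)
    # Pairwise squared Euclidean distances, shape (n, n).
    dists = np.sum((vectors[:, None, :] - vectors[None, :, :]) ** 2, axis=-1)
    scores = np.empty(n)
    for i in range(n):
        # Drop the zero self-distance, keep the n - f - 2 closest others.
        neighbours = np.sort(np.delete(dists[i], i))
        scores[i] = neighbours[: n - f - 2].sum()
    return scores

def krum(vectors, f):
    """Krum: return the single vector with the lowest score."""
    return vectors[np.argmin(krum_scores(vectors, f))]

def multi_krum(vectors, f, m):
    """MultiKrum: average the m vectors with the lowest scores
    (m = 1 recovers Krum)."""
    selected = np.argsort(krum_scores(vectors, f))[:m]
    return vectors[selected].mean(axis=0)

# Toy check: 8 honest gradients near 0, 2 large adversarial outliers.
rng = np.random.default_rng(0)
honest = rng.normal(0.0, 1.0, size=(8, 5))
byzantine = np.full((2, 5), 100.0)
vectors = np.concatenate([honest, byzantine])
print(multi_krum(vectors, f=2, m=4))  # stays close to the honest mean
```

In the toy check, the outliers' nearest-neighbour sums are dominated by their huge distances to honest vectors, so they receive the worst scores and are excluded from the selection, which is the behaviour the robustness guarantee formalizes.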