Adversarial attacks pose a major challenge to distributed learning systems, prompting the development of numerous robust learning methods. However, most existing approaches suffer from the curse of dimensionality, i.e., the error grows with the number of model parameters. In this paper, we make progress on high-dimensional problems under an arbitrary number of Byzantine attackers. The cornerstone of our design is a direct high-dimensional semi-verified mean estimation method. The idea is to identify a subspace with large variance. The components of the mean perpendicular to this subspace are estimated using the corrupted gradient vectors uploaded by worker machines, while the components within this subspace are estimated using a small auxiliary dataset of clean samples. As a result, combining a large corrupted dataset with a small clean dataset yields significantly better performance than using either of them alone. We then apply this method as the aggregator for distributed learning problems. Our theoretical analysis shows that, compared with existing solutions, our method removes the $\sqrt{d}$ dependence on the dimensionality and achieves minimax optimal statistical rates. Numerical results validate our theory as well as the effectiveness of the proposed method.
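The following is a minimal sketch of the semi-verified idea described above, not the paper's exact estimator: the top-$k$ principal subspace of the corrupted sample covariance stands in for the "large variance" subspace, the mean component inside that subspace is taken from the small clean set, and the orthogonal component is taken from the large corrupted set. The function name, the parameter `k`, and the PCA-based subspace choice are illustrative assumptions.

```python
import numpy as np

def semi_verified_mean(corrupted, clean, k):
    """Hypothetical sketch of semi-verified mean estimation.

    corrupted: (n, d) array of possibly Byzantine-contaminated vectors.
    clean:     (m, d) array of trusted auxiliary vectors, with m << n.
    k:         dimension of the high-variance subspace handed to the clean data.
    """
    d = corrupted.shape[1]
    mu_corrupted = corrupted.mean(axis=0)          # cheap but possibly biased estimate
    mu_clean = clean.mean(axis=0)                  # trusted but high-variance estimate

    # Identify the subspace where the corrupted data shows large variance.
    cov = np.cov(corrupted, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)         # ascending eigenvalues
    V = eigvecs[:, np.argsort(eigvals)[::-1][:k]]  # (d, k) top-k eigenvector basis
    P = V @ V.T                                    # projector onto that subspace

    # Inside the suspicious subspace, trust the clean data; outside it, the
    # attackers cannot shift the mean much without inflating the variance.
    return P @ mu_clean + (np.eye(d) - P) @ mu_corrupted
```

In the distributed learning setting of the paper, `corrupted` would hold the gradient vectors uploaded by the worker machines and `clean` the gradients computed on the auxiliary dataset, with the combined estimate used as the aggregated gradient.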