Distributed gradient descent algorithms have come to the fore in modern machine learning, especially for parallelizing computation over large datasets spread across several workers. However, scant attention has been paid to the behavior of distributed gradient descent in the presence of adversarial corruptions rather than random noise. In this paper, we formulate a novel problem in which adversarial corruptions are present in a distributed learning system. We show how to use ideas from (lazy) mirror descent to design a corruption-tolerant distributed optimization algorithm. Extensive convergence analysis for (strongly) convex loss functions is provided for different choices of the stepsize. We carefully optimize the stepsize schedule to accelerate the convergence of the algorithm while amortizing the effect of the corruption over time. Experiments based on linear regression, support vector classification, and softmax classification on the MNIST dataset corroborate our theoretical findings.
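The setting described above can be illustrated with a minimal sketch. The code below is not the paper's algorithm: it runs lazy (dual-averaging) gradient descent with a Euclidean mirror map on a distributed least-squares problem, where the adversary injects an additive corruption into the aggregated gradient each round. The 1/t decay of the corruption magnitude is a hypothetical schedule chosen only to illustrate how a corruption whose effect shrinks over time can be amortized; all names and constants are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear regression data split across workers (illustrative setup,
# not the paper's experimental configuration).
n_workers, n_per, d = 5, 40, 3
x_true = np.array([1.0, -2.0, 0.5])
A = [rng.normal(size=(n_per, d)) for _ in range(n_workers)]
b = [Ai @ x_true + 0.01 * rng.normal(size=n_per) for Ai in A]

def local_grad(i, x):
    """Least-squares gradient computed at worker i."""
    return A[i].T @ (A[i] @ x - b[i]) / n_per

def run(T=500, eta=0.05, corruption=0.0):
    """Lazy mirror descent with a Euclidean mirror map (dual averaging),
    which in the unconstrained case reduces to gradient descent.
    An additive corruption of magnitude corruption/t is injected into
    the aggregated gradient at round t (hypothetical decay schedule)."""
    x = np.zeros(d)
    z = np.zeros(d)  # lazy/dual iterate accumulating gradient steps
    for t in range(1, T + 1):
        # Server averages the workers' local gradients.
        g = np.mean([local_grad(i, x) for i in range(n_workers)], axis=0)
        # Adversarial corruption, decaying over time.
        g = g + (corruption / t) * rng.normal(size=d)
        z -= eta * g  # lazy step in the dual space
        x = z         # Euclidean mirror map: identity
    return x

x_clean = run(corruption=0.0)
x_corr = run(corruption=1.0)
print(np.linalg.norm(x_clean - x_true), np.linalg.norm(x_corr - x_true))
```

Because the corruption decays, the corrupted run still approaches the true parameter, with an extra error term reflecting the accumulated injected noise; with a persistent (non-decaying) corruption the iterates would instead stall at a corruption-dependent neighborhood of the optimum.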