Recently, decentralized learning has emerged as a popular peer-to-peer signal and information processing paradigm that enables scalable model training across geographically distributed agents without any central server. When some of the agents are malicious (also termed Byzantine), resilient decentralized learning algorithms can limit the impact of these Byzantine agents without knowing their number or identities, and come with guaranteed optimization errors. However, an analysis of the generalization errors, which are critical to the deployment of trained models, is still lacking. In this paper, we provide the first analysis of the generalization errors for a class of popular Byzantine-resilient decentralized stochastic gradient descent (DSGD) algorithms. Our theoretical results reveal that, because of the presence of Byzantine agents, the generalization errors cannot be entirely eliminated even if the number of training samples is infinitely large. Numerical experiments are conducted to confirm our theoretical results.
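As an illustration of the setting described above, the sketch below shows one local step of a Byzantine-resilient DSGD scheme using coordinate-wise trimmed-mean aggregation. This is one common resilient aggregation rule chosen here for concreteness; the specific rules covered by the paper's analysis may differ, and the function names and parameters are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def trimmed_mean(params, b):
    """Coordinate-wise trimmed mean: in each coordinate, discard the b
    largest and b smallest received values, then average the rest.
    This bounds the influence of up to b Byzantine neighbors."""
    X = np.sort(np.asarray(params), axis=0)  # sort each coordinate independently
    return X[b:len(params) - b].mean(axis=0)

def dsgd_step(own_model, neighbor_models, stoch_grad, lr=0.1, b=1):
    """One step at an honest agent: robustly aggregate the neighborhood's
    models (own model included), then take a local stochastic-gradient step."""
    aggregated = trimmed_mean([own_model] + neighbor_models, b)
    return aggregated - lr * stoch_grad

# Three honest models near [1, 1] and one Byzantine outlier:
honest = [np.array([1.0, 1.0]), np.array([1.1, 0.9]), np.array([0.9, 1.1])]
byzantine = [np.array([100.0, -100.0])]  # adversarial value sent by a Byzantine agent
agg = trimmed_mean(honest + byzantine, b=1)  # stays close to the honest models
```

With `b=1`, the outlier is trimmed in every coordinate, so the aggregate remains near the honest agents' models even though the Byzantine value is arbitrarily far away.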