This review paper takes a comprehensive look at malicious attacks against federated learning (FL), categorizing them from new perspectives on attack origins and targets, and providing insights into their methodology and impact. We focus on threat models targeting the learning process of FL systems. Based on the source and target of the attack, we categorize existing threat models into four types: Data to Model (D2M), Model to Data (M2D), Model to Model (M2M), and composite attacks. For each attack type, we discuss the proposed defense strategies, highlighting their effectiveness, assumptions, and potential areas for improvement. Defense strategies have evolved from using a single metric to exclude malicious clients to employing a multifaceted approach that examines client models at various phases. Our review indicates that the to-learn data, the learning gradients, and the learned model at different stages can all be manipulated to initiate malicious attacks that range from undermining model performance and reconstructing private local data to inserting backdoors. We have also seen that these threats are becoming more insidious: while earlier studies typically amplified malicious gradients, recent endeavors subtly alter the least significant weights in local models to bypass defense measures. This literature review provides a holistic understanding of the current FL threat landscape and highlights the importance of developing robust, efficient, and privacy-preserving defenses to ensure the safe and trusted adoption of FL in real-world applications.