Federated Learning (FL) facilitates collaborative model training among distributed clients while ensuring that raw data remains on local devices. Despite this advantage, FL systems are still exposed to risks from malicious or unreliable participants. Such clients can interfere with the training process by sending misleading updates, which can degrade the performance and reliability of the global model. Many existing defense mechanisms rely on gradient inspection, complex similarity computations, or cryptographic operations, which introduce additional overhead and may become unstable under non-IID data distributions. In this paper, we propose Federated Learning with Loss Trend Detection (FL-LTD), a lightweight and privacy-preserving defense framework that detects and mitigates malicious behavior by monitoring temporal loss dynamics rather than model gradients. The proposed approach identifies anomalous clients by detecting abnormal loss stagnation or abrupt loss fluctuations across communication rounds. To counter adaptive attackers, a short-term memory mechanism sustains mitigation for clients previously flagged as anomalous, while enabling trust recovery for participants that return to stable behavior. We evaluate FL-LTD on a non-IID federated MNIST setup under loss manipulation attacks. Experimental results demonstrate that the proposed method significantly enhances robustness, achieving a final test accuracy of 0.84, compared to 0.41 for standard FedAvg under attack. FL-LTD incurs negligible computational and communication overhead, maintains stable convergence, and avoids client exclusion or access to sensitive data, highlighting the effectiveness of loss-based monitoring for secure federated learning.
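The detection idea described above can be illustrated with a minimal sketch. This is not the paper's exact algorithm; the thresholds (`stall_eps`, `spike_factor`), the `memory_rounds` parameter, and the per-client loss-history structure are all illustrative assumptions. The sketch flags a client when its reported loss stagnates or spikes between consecutive rounds, and a short-term memory counter keeps the flag active for a few rounds before trust is restored:

```python
def detect_anomalies(loss_history, stall_eps=1e-3, spike_factor=2.0,
                     memory_rounds=3, flag_memory=None):
    """Illustrative loss-trend check (not the paper's exact algorithm).

    loss_history: dict mapping client id -> list of per-round losses.
    flag_memory:  dict mapping client id -> remaining penalty rounds,
                  carried across calls to model the short-term memory.
    Returns the set of currently flagged clients and the updated memory.
    """
    flag_memory = dict(flag_memory or {})
    flagged = set()
    for cid, losses in loss_history.items():
        if len(losses) >= 2:
            delta = losses[-2] - losses[-1]                   # expected loss decrease
            stagnant = abs(delta) < stall_eps                 # abnormal stagnation
            spiked = losses[-1] > spike_factor * losses[-2]   # abrupt fluctuation
            if stagnant or spiked:
                flag_memory[cid] = memory_rounds              # (re)start penalty window
        if flag_memory.get(cid, 0) > 0:
            flagged.add(cid)                                  # mitigation sustained
            flag_memory[cid] -= 1                             # trust recovery over time
    return flagged, flag_memory
```

Consistent with the abstract's claim that FL-LTD avoids client exclusion, a server would typically down-weight the updates of flagged clients during aggregation rather than drop them outright.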