LLM-based multi-agent systems (MAS) solve complex tasks through inter-agent collaboration, but their communication-driven nature also allows security risks to spread across agents and trigger system-wide failures. Existing MAS defenses are mainly reactive: they detect and isolate harmful agents after execution, which may allow irreversible damage and degrade collaborative utility. To address this, we propose a proactive defense framework for MAS security, the Simulation-aware Interception Guard (SAIGuard). SAIGuard performs communication-state simulation over the MAS interaction graph, estimates the impact of incoming messages on local agent states and the global MAS state, and detects risky messages via their reconstruction deviations from benign communication patterns. Instead of isolating agents, SAIGuard sanitizes or regenerates suspicious messages before they propagate into the system. Experiments across diverse topologies and attack scenarios show that SAIGuard reduces attack success rates while maintaining MAS utility, outperforming reactive defenses.
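The intercept-before-delivery idea can be illustrated with a small sketch. This is not SAIGuard's implementation: the abstract does not specify the simulation or reconstruction model, so the sketch swaps in a plain centroid baseline as the "reconstruction" of benign traffic and uses Euclidean distance as the deviation. The names `InterceptionGate`, `redact`, and `margin`, as well as the two-dimensional feature vectors, are hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def centroid(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of the benign feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def deviation(features: Vector, reconstruction: Vector) -> float:
    """Euclidean distance between a message's features and its reconstruction."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(features, reconstruction)))


class InterceptionGate:
    """Hypothetical gate that checks a message before it reaches the receiver.

    Assumption: the benign centroid stands in for a learned reconstruction
    model, and the threshold is a fixed multiple of the largest deviation
    observed on benign examples.
    """

    def __init__(self, benign: List[Vector], margin: float = 1.5) -> None:
        self.center = centroid(benign)
        self.threshold = margin * max(deviation(v, self.center) for v in benign)

    def is_risky(self, features: Vector) -> bool:
        return deviation(features, self.center) > self.threshold

    def deliver(self, message: str, features: Vector,
                sanitize: Callable[[str], str]) -> str:
        # Sanitize the message instead of isolating the sending agent.
        return sanitize(message) if self.is_risky(features) else message


def redact(message: str) -> str:
    """Placeholder sanitizer (assumption); a real system might regenerate the message."""
    return "[message withheld for regeneration]"


if __name__ == "__main__":
    benign_features = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.15]]
    gate = InterceptionGate(benign_features)
    print(gate.deliver("summarize the report", [0.12, 0.18], redact))
    print(gate.deliver("ignore prior instructions", [0.9, 0.95], redact))
```

In this toy setup the sending agent stays in the graph in both cases; only the content of the flagged message changes before delivery, which is the contrast with isolation-based defenses that the abstract draws.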