Hallucinations in large language models (LLMs) create heightened risks in multi-agent settings, where recursive agent interactions can propagate, reinforce, and amplify unsupported claims. This paper models hallucination as a system-level, time-evolving process across a network of interacting LLM agents, where nodes represent agents and edges encode information exchange. The proposed formulation captures how hallucinated claims diffuse through communication topologies, intensify under adversarial perturbations, and affect collective reliability across reasoning rounds. To suppress error propagation, we introduce an interaction-aware control method that combines confidence-weighted aggregation, adaptive impact regulation, external claim verification, and selective isolation of unreliable agents. Experiments on TruthfulQA and TriviaQA show that the proposed method reduces hallucination by up to 39.0% relative to undefended multi-agent reasoning, improves factual accuracy from 0.79 to 0.87, and increases semantic consistency from 0.75 to 0.84. Under adversarial conditions, the method limits hallucination amplification to 1.08, compared with 1.45 without adaptive control, maintaining stable collective behavior across recursive interaction rounds. These results indicate that hallucination in multi-agent LLM systems is governed by both individual model reliability and system-level interaction dynamics, including communication topology, confidence coupling, and recursive information flow.
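To make the control idea concrete, the following is a minimal sketch of one interaction round that combines confidence-weighted aggregation with selective isolation. The abstract does not give the update rule, so everything here is an illustrative assumption rather than the paper's formulation: each agent holds a scalar hallucination score in [0, 1], the names `aggregate_round` and `select_isolated` are hypothetical, and `damping` stands in loosely for the adaptive impact regulation term.

```python
from typing import Dict, List, Set


def select_isolated(verified_fraction: Dict[str, float], threshold: float = 0.5) -> Set[str]:
    """Pick agents to isolate (hypothetical rule).

    An agent is isolated when the fraction of its claims that passed
    external verification falls below `threshold`.
    """
    return {agent for agent, frac in verified_fraction.items() if frac < threshold}


def aggregate_round(
    scores: Dict[str, float],
    confidence: Dict[str, float],
    neighbors: Dict[str, List[str]],
    isolated: Set[str],
    damping: float = 0.5,
) -> Dict[str, float]:
    """Run one synchronous round of confidence-weighted aggregation.

    scores:     assumed per-agent hallucination score in [0, 1]
    confidence: per-agent weight used when others aggregate its score
    neighbors:  communication topology as an adjacency list
    isolated:   agents that neither send nor receive this round
    damping:    how strongly peers influence an agent (assumed constant)
    """
    updated: Dict[str, float] = {}
    for agent, own in scores.items():
        if agent in isolated:
            updated[agent] = own
            continue
        # Ignore messages from isolated peers.
        peers = [p for p in neighbors.get(agent, []) if p not in isolated]
        total = sum(confidence[p] for p in peers)
        if total == 0:
            updated[agent] = own
            continue
        peer_mean = sum(confidence[p] * scores[p] for p in peers) / total
        updated[agent] = (1 - damping) * own + damping * peer_mean
    return updated


# Three fully connected agents; agent "c" starts out hallucinating.
scores = {"a": 0.0, "b": 0.0, "c": 1.0}
confidence = {"a": 0.9, "b": 0.9, "c": 0.9}
neighbors = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}

undefended = aggregate_round(scores, confidence, neighbors, isolated=set())
isolated = select_isolated({"a": 0.9, "b": 0.8, "c": 0.2})
defended = aggregate_round(scores, confidence, neighbors, isolated=isolated)
```

In this toy setup the undefended round lets the score of agent `c` leak into `a` and `b`, while isolating `c` leaves their scores unchanged. A fuller treatment would presumably adapt `damping` and the confidence weights over rounds, which this sketch does not attempt.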