Hallucination remains a critical bottleneck for large language models (LLMs), undermining their reliability in real-world applications, especially in Retrieval-Augmented Generation (RAG) systems. Existing hallucination detection methods use an LLM-as-a-judge to verify LLM outputs against retrieved evidence, but they suffer from inherent confirmation bias: the verifier inadvertently reproduces the errors of the original generation. To address this, we introduce Multi-Agent Reinforced Self-Check for Hallucination (MARCH), a framework that enforces rigorous factual alignment through deliberate information asymmetry. MARCH orchestrates a collaborative pipeline of three specialized agents: a Solver, a Proposer, and a Checker. The Solver generates an initial RAG response, which the Proposer decomposes into atomic, claim-level propositions that can be verified individually. Crucially, the Checker validates these propositions against the retrieved evidence in isolation, without access to the Solver's original output. This information asymmetry breaks the cycle of self-confirmation bias. By training the pipeline with multi-agent reinforcement learning (MARL), we enable the agents to co-evolve and optimize factual adherence. Extensive experiments across hallucination benchmarks demonstrate that MARCH substantially reduces hallucination rates. Notably, an 8B-parameter LLM equipped with MARCH achieves performance competitive with powerful closed-source models. MARCH paves a scalable path for factual self-improvement of LLMs through co-evolution. The code is available at https://github.com/Qwen-Applications/MARCH.
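The Solver, Proposer, and Checker data flow described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: every name in it (`run_pipeline`, `Verdict`, `unsupported_rate`, and the `toy_*` stand-ins) is hypothetical, the agents are plain callables rather than LLMs, and the MARL training stage is not covered. Its only purpose is to show where the information asymmetry sits: the Checker is called with a proposition and the evidence, and never with the Solver's response.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical agent interfaces; the abstract does not specify an API.
Solver = Callable[[str, List[str]], str]    # (query, evidence) -> response
Proposer = Callable[[str], List[str]]       # response -> atomic propositions
Checker = Callable[[str, List[str]], bool]  # (proposition, evidence) -> supported?


@dataclass
class Verdict:
    proposition: str
    supported: bool


def run_pipeline(
    query: str,
    evidence: List[str],
    solver: Solver,
    proposer: Proposer,
    checker: Checker,
) -> Tuple[str, List[Verdict]]:
    """Run Solver -> Proposer -> Checker once and collect per-claim verdicts."""
    response = solver(query, evidence)
    propositions = proposer(response)
    # Information asymmetry: the checker receives one proposition plus the
    # evidence, and is never passed `response`.
    verdicts = [Verdict(p, checker(p, evidence)) for p in propositions]
    return response, verdicts


def unsupported_rate(verdicts: List[Verdict]) -> float:
    """Fraction of propositions the checker could not support (0.0 if none)."""
    if not verdicts:
        return 0.0
    return sum(not v.supported for v in verdicts) / len(verdicts)


# Toy stand-ins so the sketch runs without any model (assumptions, not MARCH).
def toy_solver(query: str, evidence: List[str]) -> str:
    # Deliberately includes one claim that the evidence does not support.
    return "The sky is blue. The moon is made of cheese."


def toy_proposer(response: str) -> List[str]:
    # Naive sentence split; a real proposer would be an LLM.
    return [s.strip().rstrip(".") for s in response.split(". ") if s.strip()]


def toy_checker(proposition: str, evidence: List[str]) -> bool:
    # Naive substring match against the evidence only.
    return any(proposition.lower() in doc.lower() for doc in evidence)
```

Calling `run_pipeline("What color is the sky?", ["The sky is blue on a clear day"], toy_solver, toy_proposer, toy_checker)` with these stand-ins marks the first proposition as supported and the second as unsupported. A signal such as `unsupported_rate` could plausibly feed a training reward, but that link is an assumption here; the abstract does not describe the reward design.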