Statute retrieval is essential for legal assistance and judicial decision support, yet real-world legal queries are often implicit, multi-issue, and expressed in colloquial or underspecified forms. These characteristics make it difficult for conventional retrieval-augmented generation (RAG) pipelines to recover the statutory elements required for accurate retrieval. Dense retrievers focus primarily on the literal surface form of the query, whereas lightweight rerankers lack the legal-reasoning capacity needed to assess statutory applicability. We present LegalMALR, a retrieval framework that integrates a Multi-Agent Query Understanding System (MAS) with a zero-shot large-language-model-based reranking module (LLM Reranker). MAS generates diverse, legally grounded reformulations and conducts iterative dense retrieval to broaden candidate coverage. To stabilise the stochastic behaviour of LLM-generated rewrites, we optimise a unified MAS policy using Generalized Reinforcement Policy Optimization (GRPO). The accumulated candidate set is subsequently evaluated by the LLM Reranker, which performs natural-language legal reasoning to produce the final ranking. We further construct CSAID, a dataset of 118 difficult Chinese legal queries annotated with multiple statutory labels, and evaluate LegalMALR on both CSAID and the public STARD benchmark. Experiments show that LegalMALR substantially outperforms strong RAG baselines in both in-distribution and out-of-distribution settings, demonstrating the effectiveness of combining multi-perspective query interpretation, reinforcement-based policy optimisation, and large-model reranking for statute retrieval.
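The control flow described above (reformulate the query, retrieve iteratively, accumulate candidates, then rerank once) can be illustrated with a minimal sketch. It is not the LegalMALR implementation: every name here (`embed`, `retrieve`, `retrieve_and_rerank`, `rewrite`, `rerank_score`) is hypothetical, a toy bag-of-words cosine similarity stands in for a real dense encoder, the `rewrite` and `rerank_score` callables stand in for the MAS and the LLM Reranker, and GRPO training is not shown.

```python
from collections import Counter
from math import sqrt
from typing import Callable, Dict, List


def embed(text: str) -> Counter:
    """Toy stand-in for a dense encoder: a bag of lowercase tokens."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, statutes: Dict[str, str], k: int) -> List[str]:
    """Return up to k statute ids with positive similarity to the query."""
    q = embed(query)
    scored = [(cosine(q, embed(text)), sid) for sid, text in statutes.items()]
    scored = [(score, sid) for score, sid in scored if score > 0.0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sid for _, sid in scored[:k]]


def retrieve_and_rerank(
    query: str,
    statutes: Dict[str, str],
    rewrite: Callable[[str], List[str]],        # hypothetical stand-in for the MAS
    rerank_score: Callable[[str, str], float],  # hypothetical stand-in for the LLM Reranker
    rounds: int = 2,
    k: int = 1,
) -> List[str]:
    """Accumulate candidates over several retrieval rounds, then rerank once."""
    candidates: List[str] = []
    frontier = [query]
    for round_idx in range(rounds):
        for q in frontier:
            for sid in retrieve(q, statutes, k):
                if sid not in candidates:
                    candidates.append(sid)
        if round_idx + 1 < rounds:
            # Expand the frontier with reformulations of the current queries.
            frontier = [r for q in frontier for r in rewrite(q)]
    # The reranker always judges candidates against the ORIGINAL query.
    return sorted(
        candidates,
        key=lambda sid: rerank_score(query, statutes[sid]),
        reverse=True,
    )


# Usage with made-up statutes and hand-written stand-ins for both models.
toy_statutes = {
    "A": "employer must pay wages on time",
    "B": "landlord must return the deposit",
    "C": "compensation for unlawful termination of labor contract",
}
toy_rewrite = lambda q: ["unlawful termination of labor contract", "employer must pay wages"]
toy_score = lambda q, text: 1.0 if "wages" in text else 0.5
print(retrieve_and_rerank("boss fired me and kept my money", toy_statutes, toy_rewrite, toy_score))
```

In this toy setup the colloquial query shares no tokens with any statute, so the first retrieval round returns nothing and candidates only appear once the reformulations are issued, which mirrors the surface-form gap the abstract attributes to dense retrievers.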