LegalMALR: Multi-Agent Query Understanding and LLM-Based Reranking for Chinese Statute Retrieval

Statute retrieval is essential for legal assistance and judicial decision support, yet real-world legal queries are often implicit, multi-issue, and expressed in colloquial or underspecified forms. These characteristics make it difficult for conventional retrieval-augmented generation (RAG) pipelines to recover the statutory elements required for accurate retrieval. Dense retrievers focus primarily on the literal surface form of the query, whereas lightweight rerankers lack the legal-reasoning capacity needed to assess statutory applicability. We present LegalMALR, a retrieval framework that integrates a Multi-Agent Query Understanding System (MAS) with a zero-shot reranking module based on a large language model (LLM Reranker). MAS generates diverse, legally grounded reformulations of the query and conducts iterative dense retrieval to broaden candidate coverage. To stabilise the stochastic behaviour of LLM-generated rewrites, we optimise a unified MAS policy using Generalized Reinforcement Policy Optimization (GRPO). The accumulated candidate set is then evaluated by the LLM Reranker, which performs natural-language legal reasoning to produce the final ranking. We further construct CSAID, a dataset of 118 difficult Chinese legal queries annotated with multiple statutory labels, and evaluate LegalMALR on both CSAID and the public STARD benchmark. Experiments show that LegalMALR substantially outperforms strong RAG baselines in both in-distribution and out-of-distribution settings, demonstrating the effectiveness of combining multi-perspective query interpretation, reinforcement-based policy optimisation, and large-model reranking for statute retrieval.
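The control flow described in the abstract (rewrite the query, retrieve for each rewrite, accumulate candidates, then rerank) can be illustrated with a minimal Python sketch. This is not the authors' implementation, and everything in it is an assumption made for illustration: token overlap stands in for a dense retriever, `toy_rewrite` stands in for the MAS, `toy_rerank` stands in for the LLM Reranker, and the three "statutes" are invented placeholder sentences rather than real legal text. GRPO training of the rewrite policy is not modelled.

```python
from typing import Callable, Dict, List

Statutes = Dict[str, str]  # statute id -> statute text


def overlap_score(query: str, text: str) -> int:
    """Toy relevance: count of shared lowercase tokens.
    Stand-in for embedding similarity; not a real dense retriever."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def retrieve(query: str, corpus: Statutes, k: int) -> List[str]:
    """Return ids of the k best-scoring statutes, skipping zero scores."""
    scored = [(overlap_score(query, text), sid) for sid, text in corpus.items()]
    scored = [(score, sid) for score, sid in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # score desc, id asc
    return [sid for _, sid in scored[:k]]


def gather_candidates(query: str, rewrite: Callable[[str], List[str]],
                      corpus: Statutes, k: int) -> List[str]:
    """Retrieve for the original query and every rewrite, accumulating
    unique statute ids in order of first appearance."""
    seen: List[str] = []
    for q in [query, *rewrite(query)]:
        for sid in retrieve(q, corpus, k):
            if sid not in seen:
                seen.append(sid)
    return seen


def pipeline(query: str, rewrite: Callable[[str], List[str]],
             rerank: Callable[[str, List[str]], List[str]],
             corpus: Statutes, k: int = 2) -> List[str]:
    """Accumulate candidates across rewrites, then hand them to the reranker."""
    return rerank(query, gather_candidates(query, rewrite, corpus, k))


# Invented placeholder texts, NOT real statutes.
CORPUS: Statutes = {
    "toy-wages": "the employer shall pay labor remuneration to the worker in full and on time",
    "toy-rent": "the lessee shall pay rent according to the agreed time limit",
    "toy-termination": "a worker may terminate the labor contract if the employer fails to pay labor remuneration",
}


def toy_rewrite(query: str) -> List[str]:
    """Hypothetical stand-in for MAS: fixed, hand-written reformulations."""
    return ["employer fails to pay labor remuneration",
            "worker terminate labor contract"]


def toy_rerank(query: str, candidates: List[str]) -> List[str]:
    """Hypothetical stand-in for the LLM Reranker: a hand-written priority
    order plays the role of the model's applicability judgement."""
    priority = {"toy-wages": 0, "toy-termination": 1, "toy-rent": 2}
    return sorted(candidates, key=lambda sid: priority[sid])


QUERY = "my boss never paid me"
baseline = retrieve(QUERY, CORPUS, 2)                         # []
candidates = gather_candidates(QUERY, toy_rewrite, CORPUS, 2)
ranked = pipeline(QUERY, toy_rewrite, toy_rerank, CORPUS, 2)
print(baseline, candidates, ranked)
```

In this toy setup the colloquial query shares no tokens with any placeholder statute, so surface-form retrieval alone returns nothing, while the reformulations recover two candidates that the reranking step then orders. That mirrors, in a very simplified way, the motivation the abstract gives for combining query reformulation with a stronger reranker.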

