Legal case retrieval remains challenging due to the complexity of legal language and the need for precise lexical alignment between queries and relevant cases. Although dense retrieval models have achieved notable progress, empirical studies show that BM25 continues to serve as a strong baseline in this domain. This motivates us to propose a self-evolving framework for rule-driven query rewriting that enhances BM25 without any parameter training. The framework equips an LLM-based agent with an automatic evaluation environment, enabling it to iteratively create rewriting rules, plan validation experiments over rule combinations, and eliminate ineffective rules based on historical feedback. We evaluate our method on the Chinese legal case retrieval benchmark LeCaRD-v2. Experimental results demonstrate that the proposed framework outperforms non-evolutionary baselines, including human-designed rules and greedy rule selection, particularly when powered by a high-capacity core LLM. We also conduct detailed analyses to investigate the mechanisms underlying self-evolution. Our findings reveal that the LLM's ability to leverage previous experimental results and its intrinsic knowledge of rule elimination play critical roles in refining the rule set via self-evolution.
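The create, validate, and eliminate cycle described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the rule pool `RULES`, the three-document toy corpus, and the `overlap_score` function, which is a simple term-overlap stand-in for BM25. The LLM agent is replaced by a fixed candidate pool with greedy acceptance, so the loop is closer to the greedy baseline than to the proposed agent, which plans its experiments from historical feedback.

```python
from typing import Callable, Dict, List, Optional, Tuple

Rule = Callable[[str], str]
FILLER = {"the", "a", "of"}

# Hypothetical rule pool; in the framework an LLM agent would author such rules.
RULES: Dict[str, Rule] = {
    "drop_filler": lambda q: " ".join(w for w in q.split() if w not in FILLER),
    "expand_theft": lambda q: q + " larceny" if "theft" in q.split() else q,
    "add_noise": lambda q: q + " miscellaneous",
}

# Toy data (assumed); the relevant case uses "larceny" where the query says "theft".
CORPUS: Dict[str, str] = {
    "contract_case": "sale of the vehicle under contract",
    "goods_case": "theft of the miscellaneous goods",
    "larceny_case": "larceny of a motor vehicle by the defendant",
}
QUERIES: Dict[str, str] = {"q1": "theft of the vehicle"}
RELEVANT: Dict[str, str] = {"q1": "larceny_case"}


def rewrite(query: str, rule_names: List[str]) -> str:
    """Apply the named rules to the query, in order."""
    for name in rule_names:
        query = RULES[name](query)
    return query


def overlap_score(query: str, doc: str) -> float:
    """Stand-in for BM25: fraction of query terms that occur in the document."""
    q_terms = query.split()
    d_terms = set(doc.split())
    return sum(t in d_terms for t in q_terms) / len(q_terms) if q_terms else 0.0


def evaluate(rule_names: List[str]) -> float:
    """Evaluation environment: mean reciprocal rank of the relevant case."""
    total = 0.0
    for qid, query in QUERIES.items():
        q = rewrite(query, rule_names)
        # sorted() is stable, so tied documents keep their corpus order.
        ranked = sorted(CORPUS, key=lambda d: overlap_score(q, CORPUS[d]), reverse=True)
        total += 1.0 / (ranked.index(RELEVANT[qid]) + 1)
    return total / len(QUERIES)


def evolve(seed: Optional[List[str]] = None, rounds: int = 2
           ) -> Tuple[List[str], float, List[Tuple[Tuple[str, ...], float]]]:
    """Alternate between adding helpful rules and eliminating unhelpful ones."""
    active = list(seed or [])
    best = evaluate(active)
    history: List[Tuple[Tuple[str, ...], float]] = []  # feedback log of every trial
    for _ in range(rounds):
        # Creation and validation: keep a candidate only if it raises the metric.
        for name in RULES:
            if name in active:
                continue
            trial = active + [name]
            score = evaluate(trial)
            history.append((tuple(trial), score))
            if score > best:
                active, best = trial, score
        # Elimination: drop any rule whose removal does not lower the metric.
        for name in list(active):
            trial = [r for r in active if r != name]
            score = evaluate(trial)
            history.append((tuple(trial), score))
            if score >= best:
                active, best = trial, score
    return active, best, history


if __name__ == "__main__":
    rules, mrr, log = evolve(seed=["add_noise"])
    print(rules, mrr, len(log))
```

Seeding the loop with the deliberately unhelpful `add_noise` rule exercises the elimination step: on this toy data the rule is removed once a query-expansion rule makes it redundant. The `history` list records every trial, which is the kind of feedback an LLM agent could condition on when proposing or pruning rules.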