While Large Reasoning Models (LRMs) have demonstrated exceptional logical capabilities in mathematical domains, their application to the legal field remains hindered by strict requirements for procedural rigor and adherence to legal logic. Existing legal LLMs, which rely on "closed-loop reasoning" derived solely from internal parametric knowledge, frequently lack self-awareness of their knowledge boundaries, leading to confident yet incorrect conclusions. To address this challenge, we present Legal Reasoning with Agentic Search (LRAS), the first framework designed to transition legal LLMs from static, parametric "closed-loop thinking" to dynamic, interactive "active inquiry". By integrating Introspective Imitation Learning and Difficulty-aware Reinforcement Learning, LRAS enables LRMs to identify their knowledge boundaries and handle the complexity of legal reasoning. Empirical results demonstrate that LRAS outperforms state-of-the-art baselines by 8.2--32\%, with the most substantial gains on tasks requiring deep reasoning grounded in reliable knowledge. We will soon release our data and models for further exploration.