Legal reasoning requires not only correct outcomes but also procedurally compliant reasoning. However, existing methods lack mechanisms to verify intermediate reasoning steps, so errors such as citations of inapplicable statutes can propagate undetected through the reasoning chain. To address this, we propose LawThinker, an autonomous legal research agent that adopts an Explore-Verify-Memorize strategy for dynamic judicial environments. The core idea is to enforce verification as an atomic operation after every knowledge exploration step. A DeepVerifier module examines each retrieval result along three dimensions: knowledge accuracy, fact-law relevance, and procedural compliance. A memory module supports cross-round knowledge reuse in long-horizon tasks. Experiments on the dynamic benchmark J1-EVAL show that LawThinker achieves a 24% improvement over direct reasoning and an 11% gain over workflow-based methods, with particularly strong improvements on process-oriented metrics. Evaluations on three static benchmarks further confirm its generalization capability. The code is available at https://github.com/yxy-919/LawThinker-agent.
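As an illustration only, the Explore-Verify-Memorize loop described above can be sketched in Python. Every name here (`Verdict`, `Memory`, `explore_and_verify`, the toy statute table, and the stub explorer and verifier) is hypothetical and is not taken from the LawThinker repository. The sketch assumes that a retrieval result is kept only if it passes all three checks, and that memory stores only verified results.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Verdict:
    """Outcome of the three checks named in the abstract (hypothetical type)."""
    knowledge_accurate: bool
    fact_law_relevant: bool
    procedurally_compliant: bool

    @property
    def passed(self) -> bool:
        return (self.knowledge_accurate
                and self.fact_law_relevant
                and self.procedurally_compliant)


@dataclass
class Memory:
    """Holds verified results so later rounds can reuse them (assumed design)."""
    verified: Dict[str, str] = field(default_factory=dict)

    def recall(self, query: str) -> Optional[str]:
        return self.verified.get(query)

    def store(self, query: str, result: str) -> None:
        self.verified[query] = result


def explore_and_verify(
    query: str,
    explore: Callable[[str], str],
    verify: Callable[[str, str], Verdict],
    memory: Memory,
) -> Optional[str]:
    """Run one exploration step with verification as an inseparable follow-up."""
    cached = memory.recall(query)
    if cached is not None:
        return cached                      # reuse knowledge verified earlier
    result = explore(query)                # explore
    if not verify(query, result).passed:   # verify, always right after exploring
        return None                        # unverified results are discarded
    memory.store(query, result)            # memorize only what passed
    return result


# Toy stand-ins, for demonstration only.
TOY_STATUTES: Dict[str, str] = {"breach of contract": "Toy Statute A"}
explore_calls: List[str] = []


def toy_explore(query: str) -> str:
    explore_calls.append(query)            # record how often retrieval runs
    return TOY_STATUTES.get(query, "")


def toy_verify(query: str, result: str) -> Verdict:
    ok = bool(result)                      # stub: any non-empty result passes
    return Verdict(ok, ok, ok)
```

Because the caller never receives a result that skipped `verify`, an inapplicable citation cannot enter later reasoning steps in this sketch. A real verifier would replace `toy_verify` with separate checks for each of the three dimensions.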