Code agents must both reason over long-horizon repository state and obey strict tool-use protocols. In paired Instruct/Thinking checkpoints, these capabilities are complementary but misaligned: the Instruct model is concise and tool-disciplined, whereas the Thinking model offers stronger planning and recovery behavior but often over-deliberates, which degrades agent performance. We present CRANE (Constrained Reasoning Injection for Code Agents via Nullspace Editing), a training-free parameter-editing method that treats the Thinking-Instruct delta as a directional pool of candidate reasoning edits for the Instruct backbone. CRANE combines three components: magnitude thresholding to denoise the delta, a Conservative Taylor Gate to retain edits that are jointly beneficial for reasoning transfer and tool-use preservation, and Graduated Sigmoidal Projection to suppress format-critical update directions. By merging paired Instruct and Thinking checkpoints, CRANE delivers strong gains over either individual model while preserving Instruct-level efficiency. On Roo-Eval it achieves pass@1 of 66.2% (+19.5%) for Qwen3-30B-A3B and 81.5% (+8.7%) for Qwen3-Next-80B-A3B. On SWE-bench-Verified it resolves up to 14 additional instances at both scales (122/500 and 180/500). On Terminal-Bench v2 it improves pass@1/pass@5 by up to 2.3%/7.8%, reaching 7.6%/17.9% and 14.8%/30.3%, respectively. Across all three benchmarks, CRANE consistently outperforms alternative merging strategies.
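To make the three-stage pipeline concrete, the following is a minimal sketch on flat parameter vectors. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the formulas, so the top-fraction magnitude rule, the sign test on first-order Taylor terms, and the sigmoid-weighted projection are all guesses at plausible forms. Every function name, argument, and default value (`keep_fraction`, `tau`, `sharpness`, the gradient vectors, the format-critical directions and their scores) is hypothetical.

```python
import math


def magnitude_threshold(delta, keep_fraction):
    """Zero all but the largest-magnitude entries of the delta (assumed rule)."""
    k = max(1, round(keep_fraction * len(delta)))
    cutoff = sorted((abs(d) for d in delta), reverse=True)[k - 1]
    # Ties at the cutoff are kept, so slightly more than k entries may survive.
    return [d if abs(d) >= cutoff else 0.0 for d in delta]


def taylor_gate(delta, grad_reasoning, grad_tool):
    """Keep entry i only if the first-order loss change g_i * d_i is
    non-positive for BOTH the reasoning loss and the tool-use loss
    (an assumed reading of "jointly beneficial")."""
    return [
        d if (gr * d <= 0.0 and gt * d <= 0.0) else 0.0
        for d, gr, gt in zip(delta, grad_reasoning, grad_tool)
    ]


def sigmoid_projection(delta, directions, scores, tau=0.5, sharpness=10.0):
    """Softly remove the component of delta along each format-critical direction.

    directions: orthonormal vectors (assumed given); scores: how format-critical
    each direction is. A high score gives a weight near 1, i.e. near-full removal.
    """
    out = list(delta)
    for u, s in zip(directions, scores):
        weight = 1.0 / (1.0 + math.exp(-sharpness * (s - tau)))
        coeff = sum(ui * di for ui, di in zip(u, delta))
        out = [oi - weight * coeff * ui for oi, ui in zip(out, u)]
    return out


def crane_like_merge(theta_instruct, theta_thinking, grad_reasoning, grad_tool,
                     directions, scores, keep_fraction=0.5):
    """Instruct weights plus the filtered Thinking-Instruct delta."""
    delta = [t - i for t, i in zip(theta_thinking, theta_instruct)]
    delta = magnitude_threshold(delta, keep_fraction)
    delta = taylor_gate(delta, grad_reasoning, grad_tool)
    delta = sigmoid_projection(delta, directions, scores)
    return [i + d for i, d in zip(theta_instruct, delta)]


# Toy example with four parameters; all numbers are made up for illustration.
theta_instruct = [0.0, 0.0, 0.0, 0.0]
theta_thinking = [1.0, -0.8, 0.05, 0.6]
grad_reasoning = [-1.0, 1.0, 0.2, -1.0]   # hypothetical reasoning-loss gradient
grad_tool = [-0.5, -0.3, 0.1, -0.2]       # hypothetical tool-use-loss gradient
directions = [[0.0, 0.0, 0.0, 1.0]]       # one format-critical direction
scores = [1.0]                            # strongly format-critical

merged = crane_like_merge(theta_instruct, theta_thinking, grad_reasoning,
                          grad_tool, directions, scores, keep_fraction=0.75)
print(merged)
```

In this toy run, the smallest delta entry is dropped by the threshold, the second entry is dropped by the gate because it would raise the hypothetical tool-use loss, and the fourth entry is almost entirely suppressed by the projection, so only the first coordinate of the Thinking delta reaches the merged weights. The soft sigmoid weight, rather than a hard cut, is one way to read "graduated": directions with intermediate scores would be only partially attenuated.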