The deployment of coding agents in privacy-sensitive and resource-constrained environments is driving demand for capable open-weight Small Language Models (SLMs). However, these models suffer from a fundamental capability gap: unlike frontier large models, they lack the strong inference-time generalization needed to work with complicated, unfamiliar codebases. We identify that the prevailing Task-Centric Learning (TCL) paradigm, which scales exposure across disparate repositories, fails to address this limitation. In response, we propose Repository-Centric Learning (RCL), a paradigm shift that prioritizes vertical repository depth over horizontal task breadth: SLMs must internalize the "physics" of a target software environment through parametric knowledge acquisition, rather than attempting to recover it via costly inference-time search. Following this new paradigm, we design a four-unit Repository-Centric Experience that transforms static codebases into interactive learning signals, and use it to train SWE-Spot-4B, a family of highly compact, repository-specialized expert models. SWE-Spot-4B breaks established scaling trends, outperforming substantially larger open-weight models (e.g., CWM by Meta, Qwen3-Coder-30B) and matching or surpassing efficiency-focused commercial models (e.g., GPT-4.1-mini, GPT-5-nano) across multiple SWE tasks. Further analysis shows that RCL yields higher training sample efficiency and lower inference costs, underscoring that, for building efficient intelligence, repository mastery is a distinct and necessary dimension that complements general coding capability.