We propose SWE-Universe, a scalable and efficient framework for automatically constructing verifiable real-world software engineering (SWE) environments from GitHub pull requests (PRs). Automatic environment building suffers from low production yield, weak verifiers, and prohibitive cost. To address these challenges, our framework uses a building agent powered by an efficient custom-trained model. The agent employs iterative self-verification and in-loop hacking detection to reliably generate high-fidelity, verifiable tasks. With this method, we scale the number of real-world multilingual SWE environments to 807,693, approaching the million scale. We demonstrate the value of these environments through large-scale agentic mid-training and reinforcement learning. Finally, we apply this technique to Qwen3-Max-Thinking and achieve a score of 75.3% on SWE-Bench Verified. Our work provides both a critical resource and a robust methodology for advancing the next generation of coding agents.
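The abstract names iterative self-verification and in-loop hacking detection but does not specify how they work. The following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes that a verifier is a shell script, that a valid verifier must fail on the pre-PR repository state and pass on the post-PR state, and that "hacking" means a verifier that inspects source text instead of running tests. The names `build_verifier`, `looks_like_hack`, `propose`, and `run` are hypothetical.

```python
import shlex
from typing import Callable, Optional

# Hypothetical heuristic: commands that read source text rather than execute tests.
TEXT_INSPECTION_COMMANDS = {"grep", "diff", "cat"}


def looks_like_hack(script: str) -> bool:
    """Toy hacking check: flag empty scripts and scripts that start with a
    text-inspection command. A real detector would need to be far stricter."""
    tokens = shlex.split(script)
    return not tokens or tokens[0] in TEXT_INSPECTION_COMMANDS


def build_verifier(
    propose: Callable[[str], str],
    run: Callable[[str, str], bool],
    max_rounds: int = 3,
) -> Optional[str]:
    """Iteratively ask an agent for a verifier script and self-verify it.

    propose(feedback) returns a candidate script (assumed agent interface).
    run(script, state) returns True if the script passes on the given
    repository state, where state is "buggy" (pre-PR) or "fixed" (post-PR).
    Returns the first accepted script, or None if every round fails.
    """
    feedback = ""
    for _ in range(max_rounds):
        script = propose(feedback)

        # In-loop hacking detection: reject before trusting any run result.
        if looks_like_hack(script):
            feedback = "verifier inspects source text instead of running tests"
            continue

        # Self-verification: must fail before the PR and pass after it.
        passes_buggy = run(script, "buggy")
        passes_fixed = run(script, "fixed")
        if passes_fixed and not passes_buggy:
            return script

        feedback = f"buggy passed={passes_buggy}, fixed passed={passes_fixed}"
    return None
```

Under these assumptions, the hacking check runs before the state check because a text-matching script can satisfy the fail-then-pass condition without executing any test, so the state check alone would accept it.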