Automating repository-level software engineering tasks is a foundational challenge for autonomous code agents, largely because configuring executable environments is difficult. Manual configuration remains a labor-intensive bottleneck, which motivates a transition toward fully automated environment configuration. Existing approaches often rely on pre-defined artifacts or are restricted to specific programming languages, limiting their applicability to diverse real-world repositories. In this paper, we propose RAT (RunAnyThing), a modular and extensible agent framework for fully automated environment configuration across programming languages on arbitrary repositories. RAT adopts a multi-stage pipeline that integrates language-aware abstraction, image initialization, a specialized configuration toolset, and a robust sandbox. Furthermore, to enable rigorous evaluation, we propose RATBench, a benchmark that provides comprehensive coverage of real-world repositories. Extensive experiments demonstrate that RAT achieves state-of-the-art performance, improving Environment Setup Success Rate (ESSR) by an average of 36.1% over strong baselines.