More with Less: An Empirical Study of Turn-Control Strategies for Efficient Coding Agents

LLM-powered coding agents, which solve software engineering tasks in iterative loops (turns), are becoming increasingly capable. However, their practical deployment is hindered by high and unpredictable costs. These costs arise from a combination of factors: cumulative token counts that grow quadratically with the number of turns, the high price of models, the large number of turns that real-world tasks require, and the tendency of agents to take inefficient or unnecessary actions. While existing research focuses on optimizing individual turns, strategic control of the total number of turns remains underexplored as a way to manage agent performance and cost. To address this gap, we conduct a comprehensive empirical study on SWE-bench using three state-of-the-art models and evaluate three turn-control strategies: an unrestricted baseline, a fixed-turn limit with reminders, and a novel dynamic-turn strategy that grants extensions on demand. Our findings first reveal a fundamental trade-off in the unrestricted setting, where no single model excels across performance, cost, and turn efficiency. We then show that a fixed-turn limit set at the 75th percentile of the baseline turn count serves as a "sweet spot", reducing costs by 24%-68% with minimal impact on solve rates. Most significantly, the dynamic-turn strategy consistently outperforms fixed-limit approaches: it achieves comparable or better solve rates while reducing costs by an additional 12%-24%, because it allocates extra turns only to the tasks that need them. This work provides the first systematic analysis of turn-control strategies and offers simple yet effective guidelines for developers to balance cost and efficacy. We demonstrate that dynamic resource allocation is a superior, easy-to-implement approach for deploying powerful yet economically viable coding agents.
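To make the fixed-turn and dynamic-turn strategies concrete, the following is a minimal sketch and not the authors' implementation. The abstract gives no implementation details, so everything here is an assumption made for illustration: the names `percentile_turn_limit`, `run_with_dynamic_turns`, and `RunResult`, the nearest-rank percentile rule, the `step` callback that stands in for one agent turn, and the extension size and cap.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence


def percentile_turn_limit(baseline_turns: Sequence[int], pct: float = 75.0) -> int:
    """Turn limit at the given percentile of unrestricted-baseline turn counts.

    Assumption: nearest-rank percentile; the study may use another definition.
    """
    if not baseline_turns:
        raise ValueError("baseline_turns must not be empty")
    ordered = sorted(baseline_turns)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


@dataclass
class RunResult:
    solved: bool
    turns_used: int
    extensions_granted: int


def run_with_dynamic_turns(
    step: Callable[[int, int], str],
    initial_limit: int,
    extension_size: int = 0,
    max_extensions: int = 0,
) -> RunResult:
    """Run a hypothetical agent under a turn budget.

    `step(turn, remaining)` performs one turn and returns "continue", "done",
    or "extend". Passing `remaining` plays the role of the reminder.
    With max_extensions=0 this reduces to a fixed-turn limit.
    """
    limit = initial_limit
    extensions = 0
    turns = 0
    while turns < limit:
        turns += 1
        action = step(turns, limit - turns)
        if action == "done":
            return RunResult(True, turns, extensions)
        # Grant an extension only when the agent asks and the cap allows it.
        if action == "extend" and extensions < max_extensions:
            limit += extension_size
            extensions += 1
    return RunResult(False, turns, extensions)


def toy_agent(turn: int, remaining: int) -> str:
    """Hypothetical agent that needs 12 turns and asks for more when out of budget."""
    if turn >= 12:
        return "done"
    return "extend" if remaining == 0 else "continue"


limit = percentile_turn_limit([10, 20, 30, 40])  # nearest-rank 75th percentile
fixed = run_with_dynamic_turns(toy_agent, initial_limit=10)
dynamic = run_with_dynamic_turns(
    toy_agent, initial_limit=10, extension_size=5, max_extensions=2
)
```

In this toy run the fixed limit of 10 turns cuts the agent off before it finishes, while the dynamic variant grants a single extension and stops as soon as the task is done, so the extra budget is spent only where it is requested.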
