Large language models (LLMs) have proven to be effective at automated program repair (APR). However, using LLMs can be costly, with companies invoicing users by the number of tokens. In this paper, we propose CigaR, the first LLM-based APR tool that focuses on minimizing the repair cost. CigaR works in two major steps: generating a first plausible patch and multiplying plausible patches. CigaR optimizes the prompts and the prompt setting to maximize the information given to LLMs using the smallest possible number of tokens. Our experiments on 429 bugs from the widely used Defects4J and HumanEval-Java datasets show that CigaR reduces the token cost by 73%. On average, CigaR spends 127k tokens per bug while the baseline uses 467k tokens per bug. On the subset of bugs that are fixed by both, CigaR spends 20k tokens per bug while the baseline uses 608k tokens per bug, a cost saving of 96%. Our extensive experiments show that CigaR is a cost-effective LLM-based program repair tool that uses a low number of tokens to automatically generate patches.
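The reported savings follow from the per-bug token averages. The sketch below recomputes them from the rounded figures quoted in the abstract; the function name `token_saving` is illustrative and not part of CigaR. Because the inputs are rounded to the nearest thousand tokens, the recomputed values may differ from the reported percentages by about a point.

```python
def token_saving(tool_tokens: float, baseline_tokens: float) -> float:
    """Return the fraction of baseline tokens saved by the tool."""
    if baseline_tokens <= 0:
        raise ValueError("baseline_tokens must be positive")
    return 1.0 - tool_tokens / baseline_tokens


# Rounded per-bug averages quoted in the abstract (assumed inputs).
overall = token_saving(127_000, 467_000)      # all 429 bugs
both_fixed = token_saving(20_000, 608_000)    # bugs fixed by both tools

print(f"overall saving:    {overall:.1%}")
print(f"both-fixed saving: {both_fixed:.1%}")
```

With these rounded inputs, the overall saving comes out at roughly 73%, and the saving on the commonly fixed subset at roughly 96 to 97%, in line with the figures stated above.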