Automatically generating source code from natural language descriptions has been a growing field of research in recent years. However, current large-scale code generation models often encounter difficulties when selecting appropriate APIs for specific contexts. These models may generate APIs that do not meet requirements or refer to non-existent APIs in third-party libraries, especially for lesser-known or private libraries. Inspired by the process of human developers using tools to search APIs, we propose ToolCoder, a novel approach that integrates API search tools with existing models to assist in code generation and API selection. To teach our model to use tools, we introduce an automated data annotation method using ChatGPT to add tool usage information into the source code data and fine-tune code generation models. During inference, we integrate API search tools into the generation process so that our model can automatically use the search tool to get suggestions when selecting an API. Our experimental results demonstrate that ToolCoder exhibits excellent performance and generalization across five public and private library code generation benchmarks, with at least 6.21% improvement on average pass@1 metrics and 9.64% improvement on average pass@5 metrics compared to state-of-the-art methods. Furthermore, we show that our relatively small ToolCoder model is comparable to one of the current best models, GPT-3.5, highlighting the potential of incorporating programming tools into the code generation process.
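The inference-time integration described above can be illustrated with a minimal sketch. It is not the ToolCoder implementation: the `<API>APISearch(query)->answer</API>` markup, the function names, and the tiny keyword index are all assumptions made for illustration, and a scripted stand-in replaces the fine-tuned model. The idea shown is only the control flow: decoding pauses when the model emits a tool request, the search tool is queried, and its suggestion is appended to the context before decoding resumes.

```python
import re
from typing import Callable, Dict, List, Optional

# Hypothetical markup for a pending tool request; the abstract does not
# specify the actual format used by ToolCoder.
CALL_PATTERN = re.compile(r"<API>(?P<tool>\w+)\((?P<query>[^)]*)\)->$")


def toy_api_search(query: str) -> str:
    """Stand-in search tool: keyword overlap against a tiny hand-written index."""
    index = {
        "read csv file": "pandas.read_csv",
        "compute array mean": "numpy.mean",
    }
    words = set(query.lower().split())
    best = max(index, key=lambda key: len(words & set(key.split())))
    return index[best] if words & set(best.split()) else "no match"


def scripted_model(chunks: List[str]) -> Callable[[str], Optional[str]]:
    """Fake model that replays fixed chunks; a real model would decode tokens."""
    remaining = iter(chunks)
    return lambda text: next(remaining, None)


def generate_with_tools(
    next_chunk: Callable[[str], Optional[str]],
    tools: Dict[str, Callable[[str], str]],
    prompt: str,
    max_steps: int = 32,
) -> str:
    """Decode chunk by chunk, pausing to run a tool whenever one is requested."""
    text = prompt
    for _ in range(max_steps):
        chunk = next_chunk(text)
        if chunk is None:  # the model has finished generating
            break
        text += chunk
        match = CALL_PATTERN.search(text)
        if match:
            # The text ends with an open tool request: answer it and close it.
            tool = tools.get(match.group("tool"))
            answer = tool(match.group("query")) if tool else "unknown tool"
            text += answer + "</API>"
    return text


model = scripted_model([
    "<API>APISearch(read csv file)->",
    "\ndf = pandas.read_csv(path)",
])
print(generate_with_tools(model, {"APISearch": toy_api_search}, "# load the data"))
```

In this sketch the tool response becomes part of the context, so the subsequent chunk can condition on the suggested API name. A real system would replace `toy_api_search` with an actual search backend (for example, documentation search for a private library) and `scripted_model` with token-level decoding.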