Automatically generating source code from natural language descriptions has become an active area of research in recent years. However, current large-scale code generation models often struggle to select appropriate APIs for a given context. They may generate APIs that do not meet the requirements or refer to APIs that do not exist in third-party libraries, especially for lesser-known or private libraries. Inspired by how human developers use tools to search for APIs, we propose ToolCoder, a novel approach that integrates API search tools with existing models to assist in code generation and API selection. To teach our model to use tools, we introduce an automated data annotation method that uses ChatGPT to add tool usage information to source code data, and we fine-tune code generation models on the annotated data. During inference, we integrate API search tools into the generation process so that our model can automatically query the search tool for suggestions when selecting an API. Our experimental results demonstrate that ToolCoder exhibits excellent performance and generalization across five public and private library code generation benchmarks, with at least 6.21% improvement on average pass@1 and 9.64% improvement on average pass@10 compared to state-of-the-art methods. Furthermore, we show that our relatively small ToolCoder model is comparable to one of the current best models, GPT-3.5, highlighting the potential of incorporating programming tools into the code generation process.
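The inference-time idea described above, where the model pauses generation to ask a search tool for an API suggestion, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `<API>APISearch(...)</API>` marker format, the names `search_api`, `scripted_model`, and `generate_with_tool`, and the keyword-overlap retriever are hypothetical stand-ins, since the abstract does not specify the actual marker format, search tool, or decoding procedure.

```python
import re
from typing import Callable, Dict, List

# Hypothetical tool-call marker; the real format is not given in the abstract.
TOOL_CALL = re.compile(r"<API>APISearch\((.*?)\)</API>")


def search_api(query: str, index: Dict[str, str]) -> str:
    """Toy search tool: return the API name whose description shares the
    most words with the query (a stand-in for a real API search engine)."""
    if not index:
        return ""
    query_words = set(query.lower().split())
    return max(
        index,
        key=lambda name: len(query_words & set(index[name].lower().split())),
    )


def scripted_model(chunks: List[str]) -> Callable[[str], str]:
    """Stand-in for a fine-tuned code model: replays fixed chunks in order
    and returns an empty string once they are exhausted."""
    remaining = iter(chunks)

    def step(_text: str) -> str:
        return next(remaining, "")

    return step


def generate_with_tool(
    step: Callable[[str], str],
    prompt: str,
    index: Dict[str, str],
    max_steps: int = 8,
) -> str:
    """Generate chunk by chunk; whenever a chunk contains a tool call,
    run the search and splice the suggested API name in its place."""
    text = prompt
    for _ in range(max_steps):
        chunk = step(text)
        if chunk == "":
            break  # the model has nothing more to emit
        match = TOOL_CALL.search(chunk)
        if match:
            suggestion = search_api(match.group(1), index)
            chunk = chunk[: match.start()] + suggestion
        text += chunk
    return text


# Hypothetical API index for illustration only.
api_index = {
    "sum": "return the sum of values",
    "mean": "return the mean of values",
    "nunique": "count distinct elements",
}
model = scripted_model(["<API>APISearch(sum of values)</API>", "()"])
completed = generate_with_tool(model, "total = df['price'].", api_index)
print(completed)  # total = df['price'].sum()
```

In this sketch the tool call is resolved outside the model and only the suggested name is written back into the output. A real system would presumably also need a search backend over library documentation and a model trained to emit such calls, neither of which is shown here.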