Automated code generation can be a powerful technique for software development: by generating code automatically from requirements, it significantly reduces the effort and time developers need to create new code. Recently, OpenAI's language model ChatGPT has emerged as a powerful tool for generating human-like responses to a wide range of textual inputs (i.e., prompts), including those related to code generation. However, the effectiveness of ChatGPT for code generation is not well understood, and generation performance may depend heavily on the choice of prompt. To investigate both issues, we conducted experiments on the CodeXGLUE dataset to evaluate ChatGPT's capabilities on two code generation tasks: text-to-code and code-to-code generation. We designed prompts using a chain-of-thought strategy with multi-step optimizations. Our results showed that carefully designed prompts that guide ChatGPT can substantially improve generation performance. We also analyzed the factors that influenced prompt design and provided insights that could guide future research.