Code generation from text requires understanding the user's intent from a natural language description and generating an executable code snippet that satisfies this intent. While recent pretrained language models demonstrate remarkable performance on this task, they fail when the given natural language description is under-specified. In this work, we introduce a novel and more realistic setup for this task. We hypothesize that the under-specification of a natural language description can be resolved by asking clarification questions. We therefore collect and introduce a new dataset named CodeClarQA, which contains pairs of natural language descriptions and code, together with synthetically created clarification questions and answers. Our empirical evaluation of pretrained language models on code generation shows that clarifications result in more precisely generated code, as evidenced by a substantial improvement in model performance across all evaluation metrics. In addition, our task and dataset introduce new challenges to the community, including when and what clarification questions should be asked. Our code and dataset are available on GitHub.