Before implementing a function, programmers are encouraged to write a purpose statement, i.e., a short, natural-language explanation of what the function computes. A purpose statement may be ambiguous, i.e., it may fail to specify the intended behaviour when two or more inequivalent computations are plausible on certain inputs. Our paper makes four contributions. First, we propose a novel heuristic that uses Large Language Models (LLMs) to suggest such inputs. Using these suggestions, the programmer may choose to clarify the purpose statement (e.g., by providing a functional example that specifies the intended behaviour on such an input). Second, to assess the quality of inputs suggested by our heuristic, and to facilitate future research, we create an open dataset of purpose statements with known ambiguities. Third, we compare our heuristic against GitHub Copilot's Chat feature, which can suggest similar inputs when prompted to generate unit tests. Fourth, we provide an open-source implementation of our heuristic as an extension to Visual Studio Code for the Python programming language, where purpose statements and functional examples are specified as docstrings and doctests, respectively. We believe that this tool will be particularly helpful to novice programmers and instructors.
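To make the docstring/doctest pairing concrete, here is a minimal sketch (not taken from the paper; the function and its behaviour are hypothetical) of a purpose statement that is ambiguous on even-length lists, with a doctest pinning down the intended behaviour on such an input:

```python
def middle(items):
    """Return the middle element of items.

    This purpose statement is ambiguous for even-length lists, where
    two "middle" elements are plausible. The doctest below serves as a
    functional example that specifies the intended behaviour on such
    an input (here, the left of the two candidates):

    >>> middle([1, 2, 3, 4])
    2
    """
    # (len - 1) // 2 selects the left middle element for even lengths
    return items[(len(items) - 1) // 2]
```

A heuristic like the one described would aim to surface an input such as `[1, 2, 3, 4]`, prompting the programmer to add exactly this kind of disambiguating doctest.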