Large language models (LLMs) have shown strong promise in automatically generating executable code from natural language descriptions, particularly through interactive features that let users steer the code-generation process with iterative feedback. However, existing interaction paradigms often assume that users have the expertise to debug source code, and they are not optimized for non-professional programmers. This makes it difficult to keep interactive code generation accessible to individuals with varying levels of programming expertise. To tackle this challenge, we present IntelliExplain, which offers a novel human-LLM interaction paradigm that improves non-professional programmers' experience by letting them interact with source code via natural language explanations. Users interact with IntelliExplain by providing natural language corrective feedback on errors they identify in the explanations. The system uses this feedback to revise the code, repeating until the user is satisfied with the system's explanation of the code. Our user study demonstrates that users with IntelliExplain achieve significantly higher success rates than with vanilla GPT-3.5 (by 11.6% and 25.3%), while also requiring 39.0% and 15.6% less time, in Text-to-SQL and Python code generation tasks, respectively.
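The explain-then-revise loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not IntelliExplain's actual implementation: the function `explanation_feedback_loop`, its parameters, the `max_rounds` cap, and the toy stand-ins for the LLM and the user are all hypothetical names introduced here.

```python
from typing import Callable, Optional


def explanation_feedback_loop(
    question: str,
    generate: Callable[[str, Optional[str], Optional[str]], str],
    explain: Callable[[str], str],
    ask_user: Callable[[str], Optional[str]],
    max_rounds: int = 5,  # assumed cap; the source does not specify one
) -> str:
    """Revise code until the user accepts its explanation or rounds run out."""
    code = generate(question, None, None)  # initial attempt, no feedback yet
    for _ in range(max_rounds):
        explanation = explain(code)       # natural language view of the code
        feedback = ask_user(explanation)  # None means the user is satisfied
        if feedback is None:
            break
        code = generate(question, code, feedback)  # revise using the feedback
    return code


# Toy stand-ins (hypothetical) so the sketch runs without any model:
# the "LLM" omits a filter until the feedback mentions the year.
def toy_generate(question, previous_code, feedback):
    if feedback and "2020" in feedback:
        return "SELECT name FROM singers WHERE year = 2020"
    return "SELECT name FROM singers"


def toy_explain(code):
    if "WHERE" in code:
        return "Returns the names of singers from the year 2020."
    return "Returns the names of all singers."


def make_scripted_user(replies):
    """Build a user callback that returns the given replies in order."""
    it = iter(replies)
    return lambda explanation: next(it)


final_code = explanation_feedback_loop(
    "Which singers released a song in 2020?",
    toy_generate,
    toy_explain,
    make_scripted_user(["Only include singers from 2020.", None]),
)
print(final_code)
```

In this sketch the user never reads the code itself, only the explanation, which mirrors the interaction the abstract describes; how the real system prompts the model for explanations and revisions is not covered here.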