An Exploratory Study of ML Sketches and Visual Code Assistants

This paper explores the integration of Visual Code Assistants into Integrated Development Environments (IDEs). In Software Engineering (SE), whiteboard sketching is often the first step before coding and serves as a crucial collaboration tool for developers. Previous studies have investigated patterns in SE sketches and how they are used in practice, yet methods for directly using these sketches for code generation remain limited. The emergence of vision-capable large language models (LLMs) presents an opportunity to bridge this gap, which is the focus of our research. We build a first prototype of a Visual Code Assistant to gather user feedback on in-IDE sketch-to-code tools. We conduct an experiment with 19 data scientists, most of whom regularly sketch as part of their job. We investigate developers' mental models by analyzing patterns commonly observed in their sketches when developing a machine learning (ML) workflow. Our analysis indicates that diagrams were the preferred organizational component (52.6%), often accompanied by lists (42.1%) and numbered points (36.8%). Our tool converts their sketches into a Python notebook by querying an LLM. We use an LLM-as-judge setup to score the quality of the generated code, finding that even brief sketching can yield useful code outlines. We also find a positive correlation between sketch time and the quality of the generated code. We conclude the study with extensive interviews to assess the tool's usefulness, explore potential use cases, and understand developers' needs. As noted by participants, promising applications for these assistants include education, prototyping, and collaborative settings. Our findings signal promise for the next generation of Code Assistants to integrate visual information, both to improve code generation and to better leverage developers' existing sketching practices.
