When designing a program, novice programmers and seasoned developers alike often sketch out (or, perhaps more famously, whiteboard) their ideas. Yet despite the introduction of natively multimodal Generative AI (GenAI) models, work on Human-GenAI collaborative coding has remained overwhelmingly focused on textual prompts, largely ignoring the visual and spatial representations that programmers naturally use to reason about and communicate their designs. In this proposal and position paper, we argue, and provide tentative evidence, that this text-centric focus overlooks other forms of prompting GenAI models. In particular, problem decomposition diagrams can function as prompts for code generation in their own right, enabling new types of programming activities and assessments. To support this position, we present findings from a large introductory Python programming course in which students constructed decomposition diagrams that were then used to prompt GPT-4.1 for code generation. We find that current models are highly successful at generating code from student-constructed diagrams. We conclude by exploring the implications of embracing multimodal prompting for computing education, particularly in the context of assessment.