LLM-powered tools such as ChatGPT Data Analysis have the potential to help users tackle the challenging task of data analysis programming, which requires expertise in data processing, programming, and statistics. However, our formative study (n=15) uncovered serious challenges in verifying AI-generated results and steering the AI (i.e., guiding the AI system to produce the desired output). We developed two contrasting approaches to address these challenges. The first (Stepwise) decomposes the problem into step-by-step subgoals, each presented as a pair of editable assumptions and code, and continues until the task is complete. The second (Phasewise) decomposes the entire problem into three editable, logical phases: structured input/output assumptions, execution plan, and code. A controlled, within-subjects experiment (n=18) compared these systems against a conversational baseline. Compared to the baseline, users reported significantly greater control with the Stepwise and Phasewise systems and found intervention, correction, and verification easier. The results suggest design guidelines and trade-offs for AI-assisted data analysis tools.
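The two decompositions described in the abstract can be pictured as simple data structures. The following is a minimal sketch only, not the authors' implementation: every class, field, and method name is hypothetical, and the rule that editing an earlier unit invalidates the later ones is an assumption made for illustration that the abstract does not state.

```python
from dataclasses import dataclass, field

# Hypothetical names throughout; the abstract specifies only the shape
# of each decomposition, not an API.


@dataclass
class Step:
    """One Stepwise subgoal: an editable assumption paired with editable code."""
    subgoal: str
    assumption: str
    code: str


@dataclass
class StepwiseTask:
    """Stepwise: subgoals are added one at a time until the task is complete."""
    steps: list[Step] = field(default_factory=list)

    def add_step(self, step: Step) -> None:
        self.steps.append(step)

    def edit_assumption(self, index: int, new_assumption: str) -> None:
        # ASSUMPTION (not from the source): later steps were generated under
        # the old assumption, so they are discarded and must be regenerated.
        self.steps[index].assumption = new_assumption
        del self.steps[index + 1:]


# Order of the three Phasewise phases, as listed in the abstract.
PHASES = ("io_assumptions", "execution_plan", "code")


@dataclass
class PhasewiseTask:
    """Phasewise: the whole problem as three editable, logical phases."""
    io_assumptions: str
    execution_plan: str
    code: str

    def edit_phase(self, phase: str, new_text: str) -> list[str]:
        """Update one phase and return the later phases.

        ASSUMPTION (not from the source): phases after the edited one are
        treated as stale and reported so a caller could regenerate them.
        """
        position = PHASES.index(phase)
        setattr(self, phase, new_text)
        return list(PHASES[position + 1:])
```

In this sketch, a Stepwise correction is local to one subgoal, whereas a Phasewise correction touches a whole phase of the solution; this difference in granularity is one way to picture the trade-offs the abstract mentions.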