Advances in AI agent capabilities have outpaced users' ability to meaningfully oversee their execution. AI agents can perform sophisticated, multi-step knowledge work autonomously from start to finish, yet this process remains effectively inaccessible during execution, often buried within large volumes of intermediate reasoning and outputs. By the time users receive the output, all underlying decisions have already been made without their involvement. This lack of transparency leaves users unable to examine the agent's assumptions, identify errors before they propagate, or redirect execution when it deviates from their intent. The stakes are particularly high in spreadsheet environments, where process and artifact are inseparable: each decision the agent makes is recorded directly in cells that belong to and reflect on the user. We introduce Pista, a spreadsheet AI agent that decomposes execution into auditable, controllable actions, giving users visibility into the agent's decision-making process and the capacity to intervene at each step. A formative study (N = 8) and a within-subjects summative evaluation (N = 16) comparing Pista to a baseline agent showed that active participation in execution influenced not only task outcomes but also users' comprehension of the task, their perception of the agent, and their sense of role within the workflow. Users recognized their own intent in the agent's actions, detected errors that post-hoc review would have failed to surface, and reported a sense of co-ownership over the resulting output. These findings indicate that meaningful human oversight of AI agents in knowledge work requires not improved post-hoc review mechanisms, but active participation in decisions as they are made.