Although recent developments in generative AI have greatly enhanced the capabilities of conversational agents such as Google's Bard or OpenAI's ChatGPT, it is unclear whether using these agents helps users across different contexts. To better understand how access to conversational AI affects productivity and trust, we conducted a mixed-methods, task-based user study, observing software engineers (N=76) as they completed a programming exam with and without access to Bard. Effects on performance, efficiency, satisfaction, and trust vary depending on user expertise, question type (open-ended "solve" questions vs. definitive "search" questions), and measurement type (demonstrated vs. self-reported). Our findings include evidence of automation complacency, increased reliance on the AI over the course of the task, and increased performance for novices on "solve"-type questions when using the AI. We discuss common behaviors, design recommendations, and impact considerations to improve collaborations with conversational AI.