We examined the mechanisms underlying productivity and performance gains from AI agents using a large-scale experiment on Pairit, a platform we developed to study human-AI collaboration. We randomly assigned 2,234 participants to human-human and human-AI teams that produced 11,024 ads for a think tank. We evaluated the ads using independent human ratings and a field experiment on X which garnered ~5M impressions. We found human-AI teams produced 50% more ads per worker and higher text quality, while human-human teams produced higher image quality, suggesting a jagged frontier of AI agent capability. Human-AI teams also produced more homogeneous, or self-similar, outputs. The field experiment revealed higher text quality improved click-through rates and view-through duration, while higher image quality improved cost-per-click rates. We found three mechanisms explained these effects. First, human-AI collaboration was more task-oriented, with 25% more task-oriented messages and 18% fewer interpersonal messages. Second, human-AI collaboration displayed more delegation, as participants delegated 17% more work to AI agents than to human partners and performed 62% fewer direct text edits when working with AI. Third, recognition that the collaborator was an AI moderated these effects as participants who correctly identified they were working with AI were more task-oriented and more likely to delegate work. These mechanisms then explained performance as task-oriented communication improved ad quality, specifically when working with AI, while interpersonal communication reduced ad quality; delegation improved text quality but had no effect on image quality and was positively associated with diversity collapse, creating homogeneous outputs of higher average quality. The results suggest AI agents drive changes in productivity, performance, and output diversity by reshaping teamwork.