Can targeted user training unlock the productive potential of generative artificial intelligence (GenAI) in professional settings? We study this question using a randomized experiment in which 164 law students completed an issue-spotting examination under one of three conditions: no GenAI access, optional access to a large language model (LLM), or LLM access with a brief training intervention. LLM access without training proved counterproductive: relative to participants without any LLM access, untrained users wrote significantly shorter answers, committed more case misstatements, and scored marginally lower, though most differences fall short of conventional significance. Training reversed this pattern. Trained participants adopted the LLM at higher rates (41% vs. 26%; p = 0.044), scored 0.27 grade points higher than untrained users (roughly one fine grade; p = 0.027), and stated applicable rules more accurately (p = 0.014). Principal stratification analysis suggests that training operates primarily through adoption rather than effectiveness: under strict mean dominance, the adoption lower bound (1.06) exceeds the effectiveness upper bound (0.42), though confidence intervals are wide. More broadly, these findings challenge the view that GenAI primarily benefits lower-skilled workers: without training, higher-ability practitioners opt out, while lower-ability users adopt the tool but use it unproductively. Realizing GenAI's productivity gains requires investment in both access and instruction.