AI-based systems such as language models have been shown to replicate and even amplify social biases reflected in their training data. Among other questionable behaviors, this can lead to AI-generated text (including text suggestions) that contains normatively inappropriate stereotypical associations. Little is known, however, about how this behavior affects the writing that people produce with these systems. We address this gap by measuring how much impact stereotypes or anti-stereotypes in single-word English LM predictive text suggestions have on the stories that people write using those tools in a co-writing scenario. We find ($n=414$) that LM suggestions that challenge stereotypes sometimes lead to a significantly increased rate of anti-stereotypical co-written stories. However, despite this increase, pro-stereotypical narratives still dominate the co-written stories, demonstrating that technical debiasing is only a partially effective strategy for alleviating harms from human-AI collaboration.