Generative AI tools are increasingly used to create portrayals of people in occupations, raising concerns about how race and gender are represented. We conducted a large-scale audit of more than 1.5 million occupational personas spanning 41 U.S. occupations, generated by four large language models that differ in their AI safety commitments and countries of origin (U.S., China, France). Benchmarked against Bureau of Labor Statistics data, the generated personas exhibit two recurring patterns: systematic shifts, in which some groups are consistently under- or overrepresented, and stereotype exaggeration, in which existing demographic skews are amplified. On average, White (-31pp) and Black (-9pp) workers are underrepresented, while Hispanic (+17pp) and Asian (+12pp) workers are overrepresented. These distortions can be extreme: across all four models, for example, housekeepers are portrayed as nearly 100\% Hispanic, while Black workers are erased from many occupations. For HCI, these findings show that the choice of model provider materially changes who is made visible, motivating model-specific audits and accountable design practices.
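To make the percentage-point (pp) metric concrete, the following is a minimal sketch of how a per-group representation gap could be computed for one occupation: the model's persona share for each demographic group minus the corresponding BLS workforce share, scaled to percentage points. The function name, data layout, and all sample numbers below are hypothetical, for illustration only; they are not the paper's actual data or pipeline.

```python
from typing import Dict

def representation_gap_pp(model_share: Dict[str, float],
                          bls_share: Dict[str, float]) -> Dict[str, float]:
    """Per-group gap in percentage points (positive = overrepresented).

    Both inputs are group -> share mappings that each sum to ~1.0.
    """
    return {group: round((model_share[group] - bls_share[group]) * 100, 1)
            for group in bls_share}

# Illustrative (made-up) shares for a single occupation:
model = {"White": 0.20, "Black": 0.05, "Hispanic": 0.55, "Asian": 0.20}
bls   = {"White": 0.48, "Black": 0.15, "Hispanic": 0.30, "Asian": 0.07}

print(representation_gap_pp(model, bls))
# {'White': -28.0, 'Black': -10.0, 'Hispanic': 25.0, 'Asian': 13.0}
```

Under this reading, the abstract's headline figures (e.g., White -31pp) would correspond to such gaps averaged across occupations and models.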