We present seven experiments exploring gender biases in GPT. In the first, GPT was asked to generate the demographic profile of the likely writer of each of forty phrases: twenty containing feminine stereotypes and twenty containing masculine stereotypes. The results show a strong asymmetry, with stereotypically masculine sentences attributed to a female writer more often than stereotypically feminine sentences to a male writer. For example, the sentence "I love playing fotbal! Im practicing with my cosin Michael" was consistently assigned by ChatGPT to a female writer. This phenomenon likely reflects the fact that, while initiatives to integrate women into traditionally masculine roles have gained momentum, the reverse movement remains relatively underdeveloped. Subsequent experiments investigate the same issue in high-stakes moral dilemmas. GPT-4 finds it more appropriate to abuse a man than a woman in order to prevent a nuclear apocalypse. This bias extends to other forms of violence central to the gender parity debate (abuse) but not to forms less central to it (torture). Moreover, the bias increases in cases of mixed-sex violence for the greater good: GPT-4 agrees with a woman using violence against a man to prevent a nuclear apocalypse but disagrees with a man using violence against a woman for the same purpose. Finally, these biases are implicit: they do not emerge when GPT-4 is asked directly to rank moral violations. These results highlight the necessity of carefully managing inclusivity efforts to prevent unintended discrimination.
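To make the first experiment's protocol concrete, the sketch below shows one way such a demographic-attribution query could be issued, assuming the OpenAI Python SDK. The model name, prompt wording, and function name are illustrative assumptions, not the authors' actual materials.

```python
# Illustrative sketch (not the authors' code): asking a GPT model for the
# inferred demographics of a sentence's writer, via the OpenAI Python SDK.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def infer_writer_demographics(sentence: str, model: str = "gpt-4") -> str:
    """Ask the model to guess the demographics of whoever wrote `sentence`.

    The prompt wording and model name are assumptions for illustration;
    the paper's exact prompts are not reproduced here.
    """
    prompt = (
        "Consider the following sentence written by an anonymous person:\n"
        f'"{sentence}"\n'
        "What are the likely gender and age of the writer? Answer concisely."
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    # One of the stereotypically masculine stimuli quoted in the abstract
    # (misspellings are part of the original stimulus).
    print(infer_writer_demographics(
        "I love playing fotbal! Im practicing with my cosin Michael"
    ))
```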