We examine how LLMs represent the gender of first names in various occupational contexts, in order to study how occupations and the perceived gender of first names influence each other. We find that LLMs' first-name gender representations correlate with real-world gender statistics for the name and are influenced by co-occurring stereotypically feminine or masculine occupations. We also study how these representations affect LLMs in a downstream occupation prediction task, and we assess their potential as an internal metric for identifying extrinsic model biases. Although feminine first-name embeddings often raise the predicted probabilities of female-dominated jobs (and masculine first-name embeddings those of male-dominated jobs), reliably using these internal gender representations for bias detection remains challenging.