Stereotypes inform how we present ourselves and others, and in turn how we behave. They are thus important to measure. Recent work has used projections of embeddings from Distributional Semantic Models (DSMs), such as BERT, to perform these measurements. However, DSMs capture cognitive associations that are not necessarily relevant to the interpersonal nature of stereotyping. Here, we propose and evaluate three novel, entity-centric methods for learning stereotypes from Twitter and Wikipedia biographies. Models are trained by leveraging the fact that multiple phrases are applied to the same person, magnifying the person-centric nature of the learned associations. We show that these models outperform existing approaches to stereotype measurement with respect to (1) predicting which identities people apply to themselves and others, and (2) quantifying stereotypes on salient social dimensions (e.g., gender). Via a case study, we also show the utility of these models for future questions in computational social science.