Anthropomorphism, or the attribution of human-like characteristics to non-human entities, has shaped conversations about the impacts and possibilities of technology. We present AnthroScore, an automatic metric of implicit anthropomorphism in language. We use a masked language model to quantify how non-human entities are implicitly framed as human by the surrounding context. We show that AnthroScore corresponds with human judgments of anthropomorphism and with dimensions of anthropomorphism described in the social science literature. Motivated by concerns about misleading anthropomorphism in computer science discourse, we use AnthroScore to analyze 15 years of research papers and downstream news articles. In research papers, we find that anthropomorphism has steadily increased over time, and that papers related to language models have the most anthropomorphism. Within ACL papers, temporal increases in anthropomorphism are correlated with key neural advancements. Building upon concerns about scientific misinformation in mass media, we identify higher levels of anthropomorphism in news headlines than in the research papers they cite. Since AnthroScore is lexicon-free, it can be directly applied to a wide range of text sources.
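To make the idea of scoring an entity by its surrounding context concrete, the following is a minimal sketch and not necessarily the exact formulation used for AnthroScore. It assumes that the entity mention has been replaced by a mask token, that a masked language model has already produced a probability distribution over replacement tokens at that position, and that the score is the log-ratio of the probability mass on human pronouns to the mass on non-human pronouns. The pronoun sets, the function name `implicit_anthro_score`, and the toy probabilities are all hypothetical and chosen only for illustration.

```python
import math

# Illustrative pronoun sets (an assumption; the abstract does not list them).
HUMAN_PRONOUNS = {"he", "she", "him", "her", "his", "hers"}
NONHUMAN_PRONOUNS = {"it", "its"}


def implicit_anthro_score(mask_probs):
    """Return log(P_human / P_nonhuman) for one masked entity mention.

    `mask_probs` maps candidate tokens to the probability a masked
    language model assigns them at the masked position. In this sketch
    the distribution is supplied by the caller rather than computed.
    """
    p_human = sum(p for tok, p in mask_probs.items()
                  if tok.lower() in HUMAN_PRONOUNS)
    p_nonhuman = sum(p for tok, p in mask_probs.items()
                     if tok.lower() in NONHUMAN_PRONOUNS)
    if p_human <= 0 or p_nonhuman <= 0:
        # The log-ratio is undefined without mass on both pronoun sets.
        raise ValueError("need nonzero probability for both pronoun sets")
    return math.log(p_human / p_nonhuman)


# Toy distributions (hand-written, not model output).
# Context that frames the entity as an agent, e.g. "<mask> understands the question."
agentive_context = {"he": 0.30, "she": 0.25, "it": 0.20, "the": 0.25}
# Context that frames the entity as an object, e.g. "We trained <mask> on the corpus."
mechanical_context = {"he": 0.02, "she": 0.02, "it": 0.70, "the": 0.26}

print(implicit_anthro_score(agentive_context))    # positive: framed as human
print(implicit_anthro_score(mechanical_context))  # negative: framed as non-human
```

Under these assumptions, a positive score means the context makes a human pronoun more likely than a non-human one at the entity's position, and a negative score means the reverse. Because only the context determines the score, no lexicon of anthropomorphic words is needed.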