AnthroScore: A Computational Linguistic Measure of Anthropomorphism

Anthropomorphism, or the attribution of human-like characteristics to non-human entities, has shaped conversations about the impacts and possibilities of technology. We present AnthroScore, an automatic metric of implicit anthropomorphism in language. We use a masked language model to quantify how non-human entities are implicitly framed as human by the surrounding context. We show that AnthroScore corresponds with human judgments of anthropomorphism and dimensions of anthropomorphism described in social science literature. Motivated by concerns of misleading anthropomorphism in computer science discourse, we use AnthroScore to analyze 15 years of research papers and downstream news articles. In research papers, we find that anthropomorphism has steadily increased over time, and that papers related to language models have the most anthropomorphism. Within ACL papers, temporal increases in anthropomorphism are correlated with key neural advancements. Building upon concerns of scientific misinformation in mass media, we identify higher levels of anthropomorphism in news headlines compared to the research papers they cite. Since AnthroScore is lexicon-free, it can be directly applied to a wide range of text sources.
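To make the masked-language-model idea concrete, here is a minimal sketch. It assumes the score is a log-ratio: the entity mention is replaced by a mask token, and the probability the model assigns to human pronouns at that position is compared with the probability it assigns to non-human pronouns. The abstract does not give the formula or the pronoun lists, so the function name `anthro_log_ratio`, the pronoun sets, and the example probabilities are illustrative assumptions. In practice the probabilities would come from a masked language model; here they are supplied by hand to keep the example self-contained.

```python
import math

# Assumed pronoun sets for illustration; the abstract does not list them.
HUMAN_PRONOUNS = ("he", "she", "her", "him")
NONHUMAN_PRONOUNS = ("it", "its")


def anthro_log_ratio(mask_probs, human=HUMAN_PRONOUNS, nonhuman=NONHUMAN_PRONOUNS):
    """Log-ratio of human vs. non-human pronoun probability at a masked slot.

    mask_probs maps candidate tokens to the probability a masked language
    model assigns them at the position where the entity was masked.
    Positive values suggest the context frames the entity as human,
    negative values suggest a non-human framing.
    """
    p_human = sum(mask_probs.get(w, 0.0) for w in human)
    p_nonhuman = sum(mask_probs.get(w, 0.0) for w in nonhuman)
    if p_human <= 0.0 or p_nonhuman <= 0.0:
        raise ValueError("both pronoun groups need non-zero probability mass")
    return math.log(p_human / p_nonhuman)


# Hypothetical probabilities for "[MASK] was trained on a large corpus."
example_probs = {"it": 0.30, "its": 0.10, "he": 0.05, "she": 0.05}
score = anthro_log_ratio(example_probs)
print(f"{score:.3f}")  # negative here: the context leans toward a non-human framing
```

Because the sketch relies only on pronoun probabilities rather than a list of anthropomorphic verbs or adjectives, it reflects the lexicon-free property the abstract emphasizes.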

