Human communication is motivated: people speak, write, and create content with a particular communicative intent in mind. As a result, information that large language models (LLMs) and AI agents process is inherently framed by humans' intentions and incentives. People are adept at navigating such nuanced information: we routinely identify benevolent or self-serving motives in order to decide what statements to trust. For LLMs to be effective in the real world, they too must critically evaluate content by factoring in the motivations of the source -- for instance, weighing the credibility of claims made in a sales pitch. In this paper, we undertake a comprehensive study of whether LLMs have this capacity for motivational vigilance. We first employ controlled experiments from cognitive science to verify that LLMs' behavior is consistent with rational models of learning from motivated testimony, and find they successfully discount information from biased sources in a human-like manner. We then extend our evaluation to sponsored online adverts, a more naturalistic reflection of LLM agents' information ecosystems. In these settings, we find that LLMs' inferences do not track the rational models' predictions nearly as closely -- partly due to additional information that distracts them from vigilance-relevant considerations. However, a simple steering intervention that boosts the salience of intentions and incentives substantially increases the correspondence between LLMs and the rational model. These results suggest that LLMs possess a basic sensitivity to the motivations of others, but generalizing to novel real-world settings will require further improvements to these models.
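As a rough illustration (an assumed Bayesian formulation, not necessarily the specific model evaluated in the paper), rational learning from motivated testimony can be framed as inference about a claim's truth $s$ from an utterance $u$, marginalizing over the speaker's latent motivation $m$ (treated here as a priori independent of $s$):

\[
P(s \mid u) \;\propto\; \sum_{m} P(u \mid s, m)\, P(m)\, P(s).
\]

Under this sketch, vigilance falls out of the likelihood term: if a self-interested speaker would produce $u$ whether or not $s$ holds, then $P(u \mid s, m)$ is nearly constant in $s$, so $u$ carries little evidence and the listener rationally discounts it.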