Humans constantly generate a diverse range of tasks guided by internal motivations. While generative agents powered by large language models (LLMs) aim to simulate this complex behavior, it remains uncertain whether they operate on similar cognitive principles. To address this, we conducted a task-generation experiment comparing human responses with those of an LLM agent (GPT-4o). We find that human task generation is consistently influenced by psychological drivers, including personal values (e.g., Openness to Change) and cognitive style. Even when these psychological drivers are explicitly provided to the LLM, it fails to reflect the corresponding behavioral patterns, producing tasks that are markedly less social, less physical, and thematically biased toward abstraction. Interestingly, although the LLM's tasks were rated as more fun and novel, this contrast highlights a disconnect between its linguistic proficiency and its capacity to generate human-like, embodied goals. We conclude that a core gap exists between the value-driven, embodied nature of human cognition and the statistical patterns of LLMs, underscoring the need to incorporate intrinsic motivation and physical grounding into the design of more human-aligned agents.