Humans constantly generate a diverse range of tasks guided by internal motivations. While generative agents powered by large language models (LLMs) aim to simulate this complex behavior, it remains uncertain whether they operate on similar cognitive principles. To address this, we conducted a task-generation experiment comparing human responses with those of an LLM agent (GPT-4o). We find that human task generation is consistently influenced by psychological drivers, including personal values (e.g., Openness to Change) and cognitive style. Even when these psychological drivers are explicitly provided to the LLM, it fails to reflect the corresponding behavioral patterns, producing tasks that are markedly less social, less physical, and thematically biased toward abstraction. Interestingly, although the LLM's tasks were rated as more fun and novel, this contrast highlights a disconnect between its linguistic proficiency and its capacity to generate human-like, embodied goals. We conclude that a core gap exists between the value-driven, embodied nature of human cognition and the statistical patterns of LLMs, underscoring the need to incorporate intrinsic motivation and physical grounding into the design of more human-aligned agents.