Large Language Models (LLMs) have demonstrated remarkable capabilities in reasoning and generation, serving as the foundation for advanced persona simulation and Role-Playing Language Agents (RPLAs). However, achieving authentic alignment with human cognitive and behavioral patterns remains a critical challenge for these agents. We present HumanLLM, a framework that treats psychological patterns as interacting causal forces. We construct 244 patterns from ~12,000 academic papers and synthesize 11,359 scenarios in which 2-5 patterns reinforce, conflict with, or modulate one another, expressed as multi-turn conversations containing inner thoughts, actions, and dialogue. Our dual-level checklists evaluate both individual pattern fidelity and emergent multi-pattern dynamics, achieving strong human alignment (r=0.91) while revealing that holistic metrics conflate simulation accuracy with social desirability. HumanLLM-8B outperforms Qwen3-32B on multi-pattern dynamics despite having 4x fewer parameters, demonstrating that authentic anthropomorphism requires cognitive modeling: simulating not just what humans do, but the psychological processes that generate those behaviors.