User interactions with language models vary due to static properties of the user (trait) and the specific context of the interaction (state). However, existing persona datasets (e.g., PersonaChat, PANDORA) capture only trait information and ignore the impact of state. We introduce Chameleon, a dataset of 5,001 contextual psychological profiles from 1,667 Reddit users, each measured across multiple contexts. Using Chameleon, we present three key findings. First, inspired by Latent State-Trait theory, we decompose variance and find that 74\% is within-person (state) while only 26\% is between-person (trait). Second, we find that LLMs are state-blind: they attend only to trait and produce similar responses regardless of state. Third, we find that reward models react to user state, but inconsistently: different models favor or penalize the same users in opposite directions. We release Chameleon to support research on affective computing, personalized dialogue, and RLHF alignment.
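For concreteness, a minimal sketch of the decomposition implied by Latent State-Trait theory (notation ours; the paper's exact estimator is not specified here): writing $Y_{it}$ for user $i$'s measured psychological profile in context $t$, the total variance splits into a between-person (trait) and a within-person (state) component,
\[
\sigma^2_{\text{total}} \;=\; \underbrace{\sigma^2_{\text{between}}}_{\text{trait}} \;+\; \underbrace{\sigma^2_{\text{within}}}_{\text{state}},
\qquad
\text{ICC} \;=\; \frac{\sigma^2_{\text{between}}}{\sigma^2_{\text{between}} + \sigma^2_{\text{within}}}.
\]
Under this reading, the reported 26\% between-person share corresponds to an intraclass correlation of roughly 0.26, with the remaining 74\% attributable to within-person variation across contexts.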