Humans are interactive agents driven to seek out situations with interesting physical dynamics. Here we formalize the functional form of physical intrinsic motivation. We first collect ratings of how interesting humans find a variety of physics scenarios. We then model these interestingness ratings by implementing several hypotheses of intrinsic motivation, ranging from models that rely on simple scene features to models that depend on forward physics prediction. We find that the single best predictor of human responses is adversarial reward, a model derived from physical prediction loss. We also find that models based on simple scene features do not generalize across all scenarios when predicting human responses. Finally, linearly combining the adversarial model with the number of collisions in a scene yields the greatest improvement in predicting human responses, suggesting that humans are driven towards scenarios that result in high information gain and physical activity.
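To make the final step concrete, the following is a minimal sketch of what linearly combining two predictors of human ratings could look like. It is an illustration under stated assumptions, not the authors' implementation: the data are synthetic, and the names `fit_linear`, `adversarial_reward`, `num_collisions`, and `ratings` are hypothetical. It fits ordinary least squares with an intercept and compares the in-sample R² of the adversarial predictor alone against the combined model.

```python
import numpy as np


def fit_linear(features, ratings):
    """Ordinary least squares with an intercept; returns (weights, r_squared)."""
    # Prepend a column of ones so the first weight acts as the intercept.
    X = np.column_stack([np.ones(len(ratings)), features])
    weights, *_ = np.linalg.lstsq(X, ratings, rcond=None)
    residuals = ratings - X @ weights
    ss_res = float(residuals @ residuals)
    ss_tot = float(((ratings - ratings.mean()) ** 2).sum())
    return weights, 1.0 - ss_res / ss_tot


# Synthetic stand-ins (assumption): one value per scene for each predictor.
rng = np.random.default_rng(0)
n_scenes = 200
adversarial_reward = rng.normal(size=n_scenes)
num_collisions = rng.poisson(3.0, size=n_scenes).astype(float)

# Synthetic "human" ratings generated from both predictors plus noise (assumption).
ratings = (
    0.6 * adversarial_reward
    + 0.3 * num_collisions
    + rng.normal(scale=0.5, size=n_scenes)
)

# Single-predictor model: adversarial reward only.
_, r2_single = fit_linear(adversarial_reward[:, None], ratings)

# Combined model: adversarial reward plus collision count.
combined = np.column_stack([adversarial_reward, num_collisions])
weights, r2_combined = fit_linear(combined, ratings)

print(f"R^2 (adversarial only): {r2_single:.3f}")
print(f"R^2 (adversarial + collisions): {r2_combined:.3f}")
```

Because the combined model nests the single-predictor model, its in-sample R² can never be lower, so a fair comparison of predictive power would use held-out scenarios or cross-validation rather than the in-sample fit shown here.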