Many important tasks are defined in terms of objects. To generalize across these tasks, a reinforcement learning (RL) agent needs to exploit the structure that the objects induce. Prior work has hard-coded object-centric features, used complex object-centric generative models, or updated state using local spatial features. However, these approaches have had limited success in enabling general RL agents. Motivated by this, we introduce "Feature-Attending Recurrent Modules" (FARM), an architecture for learning state representations that relies on simple, broadly applicable inductive biases for capturing spatial and temporal regularities. FARM learns a state representation that is distributed across multiple modules, each of which attends to spatiotemporal features with an expressive feature attention mechanism. We show that this improves an RL agent's ability to generalize across object-centric tasks. We study task suites in both 2D and 3D environments and find that FARM generalizes better than competing architectures that leverage attention or multiple modules.
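To make the idea of a state "distributed across multiple modules" that each apply feature attention more concrete, the toy sketch below may help. It is not the FARM implementation: the abstract does not specify the attention or recurrence equations, so the sigmoid channel gating, the mean pooling over positions, the plain tanh recurrence, and all names (`ToyModule`, `toy_state`, `w_gate`, `w_in`, `w_rec`) are assumptions chosen only for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class ToyModule:
    """One hypothetical recurrent module with its own hidden state."""

    def __init__(self, num_channels, hidden_size, rng):
        self.hidden = [0.0] * hidden_size
        self.w_gate = random_matrix(num_channels, hidden_size, rng)  # hidden -> channel gates
        self.w_in = random_matrix(hidden_size, num_channels, rng)    # attended features -> hidden
        self.w_rec = random_matrix(hidden_size, hidden_size, rng)    # hidden -> hidden

    def step(self, feature_map):
        # Assumed form of feature attention: one gate in (0, 1) per channel,
        # computed from this module's previous hidden state.
        gates = [sigmoid(g) for g in matvec(self.w_gate, self.hidden)]
        # Apply the same channel gates at every spatial position, then
        # average over positions to get one attended feature vector.
        n = len(feature_map)
        attended = [
            sum(gates[c] * pos[c] for pos in feature_map) / n
            for c in range(len(gates))
        ]
        # A plain tanh recurrence stands in for the real recurrent core.
        pre = [
            a + b
            for a, b in zip(matvec(self.w_in, attended),
                            matvec(self.w_rec, self.hidden))
        ]
        self.hidden = [math.tanh(p) for p in pre]
        return self.hidden


def toy_state(modules, feature_map):
    """Concatenate module hidden states into one distributed state vector."""
    state = []
    for module in modules:
        state.extend(module.step(feature_map))
    return state


rng = random.Random(0)
modules = [ToyModule(num_channels=3, hidden_size=4, rng=rng) for _ in range(2)]
# Two time steps of a 4-position, 3-channel feature map (made-up numbers).
frames = [
    [[1.0, 0.0, 0.5], [0.2, 0.1, 0.0], [0.0, 0.9, 0.3], [0.4, 0.4, 0.4]],
    [[0.0, 1.0, 0.5], [0.1, 0.2, 0.0], [0.9, 0.0, 0.3], [0.4, 0.4, 0.4]],
]
for frame in frames:
    state = toy_state(modules, frame)
print(len(state))  # 2 modules x 4 hidden units = 8
```

In this sketch each module sees the same feature map but weights its channels differently depending on its own history, so different modules can, in principle, come to track different features over time. An RL agent would consume the concatenated vector as its state.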