To solve control problems via model-based reasoning or planning, an agent needs to know how its actions affect the state of the world. The actions at an agent's disposal often change the state of the environment in systematic ways. However, existing techniques for world modelling do not guarantee that the effects of actions are represented in such systematic ways. We introduce the Parsimonious Latent Space Model (PLSM), a world model that regularizes the latent dynamics to make the effects of the agent's actions more predictable. Our approach minimizes the mutual information between latent states and the change that an action produces in the agent's latent state, which in turn minimizes the dependence of the dynamics on the state. This makes the world model softly state-invariant. We combine PLSM with different model classes used for i) future latent state prediction, ii) planning, and iii) model-free reinforcement learning. We find that our regularization improves accuracy, generalization, and performance in downstream tasks, highlighting the importance of systematic treatment of actions in world models.