In this paper, we introduce ObjectZero, a novel reinforcement learning (RL) algorithm that uses object-level representations to model dynamic environments more effectively. Unlike traditional approaches that process the world as a single undifferentiated input, our method employs Graph Neural Networks (GNNs) to capture interactions among multiple objects. These objects, which can be manipulated and can interact with one another, form the basis of the model's understanding of the environment. We trained the algorithm in a complex environment containing many diverse, interactive objects and found that it learns and predicts object dynamics effectively. Our results show that a structured world model operating on object-centric representations can be successfully integrated into a model-based RL algorithm that uses Monte Carlo Tree Search (MCTS) as its planning module.
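To make the idea of a GNN-based, object-centric world model concrete, the following is a minimal sketch of one message-passing transition step over a set of object vectors. It is an illustration under stated assumptions, not the ObjectZero architecture: the function names (`init_params`, `gnn_transition`), the fully connected object graph, the sum aggregation, the residual update, and all layer sizes are hypothetical choices made for this example. In a model-based agent, a function of this kind could play the role of the learned dynamics model that a planner such as MCTS queries when expanding nodes.

```python
import numpy as np


def mlp(x, w1, b1, w2, b2):
    """Two-layer perceptron with a ReLU hidden layer."""
    hidden = np.maximum(x @ w1 + b1, 0.0)
    return hidden @ w2 + b2


def init_params(rng, obj_dim, action_dim, hidden=32, msg_dim=16):
    """Randomly initialise edge and node networks (sizes are illustrative)."""
    def layer(n_in, n_out):
        return rng.normal(0.0, 0.1, (n_in, n_out)), np.zeros(n_out)

    edge = (*layer(2 * obj_dim, hidden), *layer(hidden, msg_dim))
    node = (*layer(obj_dim + msg_dim + action_dim, hidden),
            *layer(hidden, obj_dim))
    return {"edge": edge, "node": node}


def gnn_transition(objects, action, params):
    """Predict next object states from current ones and an action.

    objects: array of shape (K, D), one row per object.
    action:  array of shape (A,), shared by all objects.
    Returns an array of shape (K, D).
    """
    k = objects.shape[0]

    # Build every ordered (sender i, receiver j) pair on a fully connected graph.
    senders = np.repeat(objects, k, axis=0)    # row i*k + j holds object i
    receivers = np.tile(objects, (k, 1))       # row i*k + j holds object j
    pairs = np.concatenate([senders, receivers], axis=1)

    # Edge network: one message per ordered pair, indexed as messages[i, j].
    messages = mlp(pairs, *params["edge"]).reshape(k, k, -1)

    # Drop self-messages, then sum the incoming messages for each receiver.
    no_self = 1.0 - np.eye(k)[:, :, None]
    incoming = (messages * no_self).sum(axis=0)  # shape (K, msg_dim)

    # Node network: update each object from its state, its messages, the action.
    action_per_obj = np.broadcast_to(action, (k, action.shape[0]))
    node_in = np.concatenate([objects, incoming, action_per_obj], axis=1)

    # Residual update: predict the change in each object's state.
    return objects + mlp(node_in, *params["node"])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = init_params(rng, obj_dim=4, action_dim=2)
    objects = rng.normal(size=(3, 4))
    action = np.array([1.0, 0.0])
    print(gnn_transition(objects, action, params).shape)
```

Because messages are summed over senders and the same networks are applied to every object, reordering the input objects simply reorders the predictions. This permutation equivariance is one reason graph networks are a natural fit for sets of interacting objects, in contrast to a model that consumes the scene as one flat vector.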