Model-based reinforcement learning (MBRL) has achieved remarkable success in continuous control by leveraging latent world models. However, prevailing approaches typically rely on monolithic latent dynamics that entangle agent and environment dynamics in a single coupled process. This coupling severely limits reusability: changing the agent requires retraining the entire world model from scratch, even if the environment remains unchanged. To address this, we introduce BRICKS-WM (Building Reusability via Interface Composition Kinetics for Structured World Models), a framework for the modular assembly of structured world models. Driven by the insight that the physical world is composed of independent entities, we posit that global dynamics can be modeled as a composition of distinct dynamical modules that interact via latent interfaces. As a minimal instantiation, we factorize the latent state space into an actuated Agent module and an external Background module, bridged by a learned latent interface. Unlike prior object-centric methods that prioritize visual segmentation, BRICKS-WM enforces a functional separation in the transition dynamics, ensuring that the background dynamics remain agnostic to the agent's dynamics. Empirically, BRICKS-WM achieves control performance comparable to strong monolithic baselines when trained from scratch, and enables the reuse of frozen background dynamics across agents.
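To make the factorization concrete, the following is a minimal sketch of one composed transition step, not the BRICKS-WM implementation. Everything in it is an assumption made for illustration: the class names `AgentModule` and `BackgroundModule`, the single tanh layers with random weights standing in for learned networks, the latent sizes, and the choice that the agent writes to the interface from its own latent while reading the background latent directly. The only property taken from the abstract is that the background transition sees its own latent and the interface signal, never the agent's latent or action, so the same background module can be paired with different agents.

```python
import math
import random


def random_weights(rng, n_out, n_in, scale=0.1):
    """Random weight matrix as a list of rows (stand-in for a learned layer)."""
    return [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]


def layer(weights, vec):
    """Compute tanh(W @ vec) with plain Python lists."""
    return [math.tanh(sum(w * x for w, x in zip(row, vec))) for row in weights]


class BackgroundModule:
    """Hypothetical background dynamics: input is (z_bg, interface) only."""

    def __init__(self, rng, bg_dim, iface_dim):
        self.w = random_weights(rng, bg_dim, bg_dim + iface_dim)

    def step(self, z_bg, iface):
        # No agent latent or action appears here; the agent influences the
        # background only through the fixed-size interface vector.
        return layer(self.w, z_bg + iface)


class AgentModule:
    """Hypothetical agent dynamics plus the map that writes to the interface."""

    def __init__(self, rng, agent_dim, action_dim, bg_dim, iface_dim):
        self.w_dyn = random_weights(rng, agent_dim, agent_dim + action_dim + bg_dim)
        self.w_iface = random_weights(rng, iface_dim, agent_dim)

    def interface(self, z_agent):
        return layer(self.w_iface, z_agent)

    def step(self, z_agent, action, z_bg):
        # Assumption: the agent conditions on the background latent directly.
        return layer(self.w_dyn, z_agent + action + z_bg)


def composed_step(agent, background, z_agent, z_bg, action):
    """One transition of the composed world model."""
    iface = agent.interface(z_agent)
    next_bg = background.step(z_bg, iface)
    next_agent = agent.step(z_agent, action, z_bg)
    return next_agent, next_bg


rng = random.Random(0)
BG_DIM, IFACE_DIM = 4, 2
background = BackgroundModule(rng, BG_DIM, IFACE_DIM)

# Two hypothetical agents with different latent and action sizes share
# the same background module, whose weights are never modified.
agent_a = AgentModule(rng, agent_dim=3, action_dim=1, bg_dim=BG_DIM, iface_dim=IFACE_DIM)
agent_b = AgentModule(rng, agent_dim=6, action_dim=2, bg_dim=BG_DIM, iface_dim=IFACE_DIM)

z_bg = [0.0] * BG_DIM
za, zbg_a = composed_step(agent_a, background, [0.1] * 3, z_bg, [1.0])
zb, zbg_b = composed_step(agent_b, background, [0.1] * 6, z_bg, [1.0, -1.0])
```

In this sketch, swapping `agent_a` for `agent_b` changes the agent latent size and action size without touching `background`, because the interface has a fixed dimension. That is the sense in which a frozen background module could be reused across agents under these assumptions.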