We present COmpetitive Mechanisms for Efficient Transfer (COMET), a modular world model that leverages reusable, independent mechanisms across different environments. COMET is trained on multiple environments with varying dynamics via a two-step process, competition and composition, which enables the model to recognise and learn transferable mechanisms. In the competition phase, COMET is trained with winner-takes-all gradient allocation, which encourages independent mechanisms to emerge. In the composition phase, COMET learns to re-compose these learnt mechanisms in ways that capture the dynamics of intervened environments. In doing so, COMET explicitly reuses prior knowledge, enabling efficient and interpretable adaptation. We evaluate COMET on environments with image-based observations. We demonstrate that COMET, in contrast to competitive baselines, captures recognisable mechanisms without supervision. Moreover, we show that COMET adapts to new environments with varying numbers of objects more sample-efficiently than more conventional finetuning approaches.
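To make the idea of winner-takes-all gradient allocation concrete, the following is a minimal sketch and not COMET's actual implementation. It assumes a toy setting in which each mechanism is a linear map on a low-dimensional state vector (COMET itself works from image-based observations), and the function name `winner_takes_all_step`, the squared-error loss, and the learning rate are all hypothetical choices made for illustration.

```python
import numpy as np


def winner_takes_all_step(weights, x, y, lr=0.1):
    """Update only the mechanism that currently predicts y best.

    weights: (K, D, D) array; each slice is one hypothetical linear mechanism.
    x, y:    (D,) current state and target next state.
    Returns the index of the winning mechanism. `weights` is modified in place.
    """
    preds = weights @ x                       # (K, D): one prediction per mechanism
    errors = ((preds - y) ** 2).sum(axis=1)   # (K,): squared error per mechanism
    winner = int(np.argmin(errors))           # the best predictor wins the sample

    # Gradient of the winner's squared error w.r.t. its weight matrix.
    grad = 2.0 * np.outer(preds[winner] - y, x)
    weights[winner] -= lr * grad              # all other mechanisms are untouched
    return winner


# Toy usage: three randomly initialised mechanisms, one training sample.
rng = np.random.default_rng(0)
weights = rng.normal(size=(3, 2, 2))
x = np.array([1.0, -1.0])
y = np.array([0.5, 0.5])
winner = winner_takes_all_step(weights, x, y)
```

Because losing mechanisms receive no gradient for a sample, each one is only ever pushed toward the transitions it already explains best, which is the intuition behind specialisation under competition in this simplified setting.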