Most neural models of causality assume static causal graphs and therefore fail to capture the dynamic, sparse nature of physical interactions, in which causal relationships emerge and dissolve over time. We introduce the Causal Process Framework and its neural implementation, Causal Process Models (CPMs), for learning sparse, time-varying causal graphs from visual observations. Unlike traditional approaches that maintain dense connectivity, our model explicitly constructs causal edges only when objects actively interact, which improves both interpretability and computational efficiency. We achieve this by casting dynamic interaction-graph construction for world modeling as a multi-agent reinforcement learning problem, in which specialized agents sequentially decide which objects are causally connected at each timestep. Our key innovation is a structured representation that factorizes object and force vectors along three learned dimensions (mutability, causal relevance, and control relevance), enabling the automatic discovery of semantically meaningful encodings. We demonstrate that CPMs significantly outperform dense graph baselines on physical prediction tasks, particularly for longer horizons and varying object counts.
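To make the contrast between dense and sparse, time-varying graphs concrete, the following is a minimal illustrative sketch, not the CPM implementation. It assumes 2D object positions and replaces the learned multi-agent edge decisions with a hand-written distance threshold; the names `sparse_edges`, `dense_edge_count`, `radius`, and the toy `trajectory` are all hypothetical.

```python
import math
from itertools import combinations


def sparse_edges(positions, radius):
    """Return undirected edges (i, j) for objects closer than `radius`.

    ASSUMPTION: the distance test is only a stand-in for the learned,
    sequential agent decisions described in the abstract.
    """
    edges = set()
    for i, j in combinations(range(len(positions)), 2):
        if math.dist(positions[i], positions[j]) < radius:
            edges.add((i, j))
    return edges


def dense_edge_count(num_objects):
    """Number of undirected edges in a fully connected graph."""
    return num_objects * (num_objects - 1) // 2


# Hypothetical trajectory: object 0 approaches object 1, touches it,
# then moves away; object 2 never interacts with anything.
trajectory = [
    [(0.0, 0.0), (3.0, 0.0), (10.0, 10.0)],
    [(1.5, 0.0), (3.0, 0.0), (10.0, 10.0)],
    [(2.5, 0.0), (3.0, 0.0), (10.0, 10.0)],
    [(1.0, 0.0), (3.0, 0.0), (10.0, 10.0)],
]

# One edge set per timestep: the graph changes as interactions start and stop.
graphs = [sparse_edges(frame, radius=1.0) for frame in trajectory]

for t, edges in enumerate(graphs):
    print(f"t={t}: sparse edges={sorted(edges)}, dense edges={dense_edge_count(3)}")
```

In this toy example an edge exists only at the single timestep where objects 0 and 1 are in contact, whereas a dense baseline would process all three pairs at every step.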