Generative models are spearheading recent progress in deep learning, showcasing strong promise for trajectory sampling in dynamical systems as well. However, whereas latent space modeling paradigms have transformed image and video generation, similar approaches are harder to apply to most dynamical systems. Such systems -- from chemical molecule structures to collective human behavior -- are described by interactions of entities, making them inherently linked to connectivity patterns, entity conservation, and the traceability of entities over time. Our approach, LaM-SLidE (Latent Space Modeling of Spatial Dynamical Systems via Linked Entities), bridges the gap between (1) keeping the traceability of individual entities in a latent system representation, and (2) leveraging the efficiency and scalability of recent advances in image and video generation, where pre-trained encoders and decoders enable generative modeling directly in latent space. The core idea of LaM-SLidE is the introduction of identifier representations (IDs) that enable the retrieval of entity properties and entity composition from latent system representations, thus fostering traceability. Experimentally, across different domains, we show that LaM-SLidE performs favorably in terms of speed, accuracy, and generalizability. Code is available at https://github.com/ml-jku/LaM-SLidE.