The dynamics of many-body systems can often be captured in terms of only a few relevant variables. Mathematical and numerical approaches exist to identify these variables by exploiting a separation of time scales between slow relevant and fast irrelevant variables, but such a separation of scales is not always obvious or even available. In this work, we introduce an information-theoretic framework for dimensionality reduction in dynamical systems that bypasses this limitation by instead identifying relevant variables based on how predictive they are of the system's future. To do so, we mathematically formalize the intuition that model reduction is about keeping "relevant" information while throwing away "irrelevant" information. We characterize the solution of the resulting optimization problem and prove that it reduces to standard approaches when a separation of time scales is indeed present in the dynamics. Importantly, we find that within this framework, the problems of identifying relevant variables and identifying their effective dynamics decouple and may be solved separately. This makes the method tractable in practice and enables us to derive dimensionally reduced variables from data with neural networks. Combined with existing equation learning methods, the procedure introduced in this work reveals the dynamical rules governing the system's evolution in a data-driven manner. We illustrate these tools in diverse settings including simulated chaotic systems, uncurated satellite recordings of atmospheric fluid flows, and experimental videos of cyanobacteria colonies in which we discover an emergent synchronization order parameter.
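As a rough illustration of what "predictive of the system's future" can mean, the toy sketch below compares a slow and a fast variable by the information each carries about its own future value. It is not the method of this work, which derives reduced variables with neural networks. It assumes two independent linear processes with Gaussian noise, for which the mutual information between present and future follows from the lagged correlation $\rho$ as $-\tfrac{1}{2}\ln(1-\rho^2)$. All function names, parameters, and values here are hypothetical choices made for the example.

```python
import math
import random


def simulate_ar1(phi, n_steps, rng):
    """Simulate x[t+1] = phi * x[t] + unit-variance Gaussian noise."""
    x = 0.0
    series = []
    for _ in range(n_steps):
        x = phi * x + rng.gauss(0.0, 1.0)
        series.append(x)
    return series


def lagged_correlation(series, lag):
    """Pearson correlation between series[t] and series[t + lag]."""
    past = series[:-lag]
    future = series[lag:]
    n = len(past)
    mean_p = sum(past) / n
    mean_f = sum(future) / n
    cov = sum((p - mean_p) * (f - mean_f) for p, f in zip(past, future)) / n
    var_p = sum((p - mean_p) ** 2 for p in past) / n
    var_f = sum((f - mean_f) ** 2 for f in future) / n
    return cov / math.sqrt(var_p * var_f)


def gaussian_predictive_information(series, lag):
    """Present-future mutual information in nats, assuming joint Gaussianity."""
    rho = lagged_correlation(series, lag)
    return -0.5 * math.log(1.0 - rho ** 2)


# Hypothetical toy system: one slowly decaying and one quickly decaying variable.
rng = random.Random(0)
slow = simulate_ar1(phi=0.99, n_steps=20000, rng=rng)
fast = simulate_ar1(phi=0.10, n_steps=20000, rng=rng)

lag = 10  # prediction horizon in time steps (assumed for illustration)
info_slow = gaussian_predictive_information(slow, lag)
info_fast = gaussian_predictive_information(fast, lag)
print(f"slow variable: {info_slow:.3f} nats about its future")
print(f"fast variable: {info_fast:.3f} nats about its future")
```

In this toy case the slow variable retains far more information about the future than the fast one, so ranking by predictive information recovers the choice that a time-scale separation argument would make. The Gaussian formula is only a stand-in; it does not apply to the nonlinear, high-dimensional data the abstract describes.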