Identifying the dynamical state variables of a system from high-dimensional observations is a central problem across the physical sciences. The challenge is that the state variables are not directly observable and must be inferred, without supervision, from raw high-dimensional data. Here we introduce DySIB (Dynamical Symmetric Information Bottleneck), a method that learns low-dimensional representations of time-series data by maximizing the predictive mutual information between past and future observation windows while penalizing representation complexity. The objective operates entirely in latent space and avoids reconstructing the observations. We apply DySIB to an experimental video dataset of a physical pendulum, for which the underlying state space is known. With the hyperparameters of the learning architecture set self-consistently from the data, the method recovers a two-dimensional representation that matches the dimensionality, topology, and geometry of the pendulum phase space, and the learned coordinates align smoothly with the canonical angle and angular velocity. These results demonstrate, on a well-characterized experimental system, that predictive information in latent space can be used to recover interpretable dynamical coordinates directly from high-dimensional data.
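To make the predictive term of such an objective concrete, the following is a minimal sketch, not the DySIB implementation. It assumes an InfoNCE lower bound as the mutual information estimator and fixed random linear maps in place of trained encoders; neither choice is stated in the abstract. The names `make_windows`, `infonce_bound`, `W_past`, and `W_future` are hypothetical, and the two-channel sinusoid is a toy stand-in for the pendulum video. The complexity penalty is omitted.

```python
import numpy as np

def make_windows(series, past_len, future_len):
    """Split a (T, D) series into aligned, flattened past and future windows."""
    n = series.shape[0] - past_len - future_len + 1
    past = np.stack([series[i:i + past_len].ravel() for i in range(n)])
    future = np.stack(
        [series[i + past_len:i + past_len + future_len].ravel() for i in range(n)]
    )
    return past, future

def infonce_bound(z_past, z_future):
    """InfoNCE lower bound (in nats) on I(z_past; z_future), dot-product critic.

    Assumption: InfoNCE is an illustrative estimator chosen for this sketch.
    """
    scores = z_past @ z_future.T  # (n, n); diagonal entries are the true pairs
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    log_softmax = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return float(np.mean(np.diag(log_softmax)) + np.log(scores.shape[0]))

# Toy stand-in for observations: two channels of a periodic signal.
rng = np.random.default_rng(0)
t = np.linspace(0, 20, 200)
series = np.stack([np.sin(t), np.cos(t)], axis=1)

past, future = make_windows(series, past_len=5, future_len=5)

# Hypothetical encoders: untrained random linear maps to a 2D latent space.
W_past = rng.normal(size=(past.shape[1], 2))
W_future = rng.normal(size=(future.shape[1], 2))

bound = infonce_bound(past @ W_past, future @ W_future)
print(f"InfoNCE bound on predictive information: {bound:.3f} nats")
```

In a trained model the two encoders would be optimized to raise this bound while a separate term penalizes the information each latent retains about its input window. Note that the InfoNCE estimate cannot exceed the logarithm of the number of window pairs, so the batch size limits how much predictive information it can report.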