Information theory is a powerful framework for capturing aspects of dynamical systems with multiple degrees of freedom. Mathematically, the dynamics can be represented as a continuous curve $\mathcal{C}$ on a suitable hyperplane in flat space, and the Fisher information provides the norm of an infinitesimal displacement along this curve. In many applications, however, we do not have direct access to $\mathcal{C}$. Instead, we have to reconstruct it from a time series of measurements (obtained as samples of size $n$), which are represented by an ordered set of points $\widehat{\mathcal{C}}$ on the same hyperplane. In this work, we calculate the bias of the Fisher information for large $n$, which provides a quantitative estimate of how accurately the dynamics of a system can be reconstructed from a given set of sampled data. Based on this result, we show that clustering the degrees of freedom reduces the bias and thus improves the accuracy with which the clustered system can be described using the same data. Inspired by a recent proposal for such a clustering, we provide a quantitative assessment of the loss of information, which allows us to estimate how much information about the dynamics of a system can reliably be extracted from a given set of data. We illustrate our findings with a simple compartmental model. Although this model is inspired by epidemiology, the results of this work are applicable to very general dynamical models with multiple degrees of freedom.
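The effect described above can be explored numerically with a small Monte Carlo experiment. The following is a minimal sketch under stated assumptions, not the calculation carried out in this work: it assumes that the degrees of freedom are the probabilities of a categorical variable, that the Fisher information is approximated by a plug-in finite difference $\sum_i (q_i - p_i)^2 / (p_i\,\Delta t^2)$ between two consecutive time steps, and that each time step is measured with an independent sample of size $n$. The function names, the distributions `p` and `q`, and the grouping of categories are all hypothetical choices made for illustration.

```python
import random


def fisher_step(p, q, dt=1.0):
    """Finite-difference Fisher information between distributions p and q.

    Assumption: plug-in form sum_i (q_i - p_i)^2 / p_i / dt^2.
    Categories with p_i == 0 are skipped to avoid division by zero.
    """
    return sum((qi - pi) ** 2 / pi for pi, qi in zip(p, q) if pi > 0) / dt ** 2


def sample_freqs(p, n, rng):
    """Empirical frequencies from n independent draws of a categorical variable."""
    counts = [0] * len(p)
    for i in rng.choices(range(len(p)), weights=p, k=n):
        counts[i] += 1
    return [c / n for c in counts]


def cluster(p, groups):
    """Merge categories: each group of indices becomes one degree of freedom."""
    return [sum(p[i] for i in g) for g in groups]


def estimate_bias(p, q, n, groups=None, trials=300, seed=0):
    """Monte Carlo estimate of (mean plug-in estimate) - (exact value)."""
    rng = random.Random(seed)
    if groups is None:
        groups = [[i] for i in range(len(p))]  # no clustering
    true_value = fisher_step(cluster(p, groups), cluster(q, groups))
    total = 0.0
    for _ in range(trials):
        p_hat = cluster(sample_freqs(p, n, rng), groups)
        q_hat = cluster(sample_freqs(q, n, rng), groups)
        total += fisher_step(p_hat, q_hat)
    return total / trials - true_value


# Hypothetical system with six degrees of freedom at two consecutive times.
p = [0.30, 0.25, 0.15, 0.12, 0.10, 0.08]
q = [0.28, 0.26, 0.16, 0.13, 0.09, 0.08]
n = 1000

bias_full = estimate_bias(p, q, n)
bias_clustered = estimate_bias(p, q, n, groups=[[0, 1], [2, 3], [4, 5]])
print("bias, 6 degrees of freedom:", bias_full)
print("bias, 3 clusters:          ", bias_clustered)
```

For this toy estimator, sampling noise enters the squared differences and inflates the estimate, so the bias is expected to be positive and to grow with the number of categories. Merging categories lowers it, at the price of a smaller exact Fisher information for the clustered system, which is one simple way to see the trade-off between bias and loss of information.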