Discrete distributions are learnable from metastable samples

Physically motivated stochastic dynamics are often used to sample from high-dimensional distributions. However, such dynamics often get stuck in specific regions of their state space and mix very slowly to the desired stationary state. As a result, these systems approximately sample from a metastable distribution, which is usually quite different from the desired stationary distribution of the dynamics. We rigorously show that, for multi-variable discrete distributions, the true model describing the stationary distribution can be recovered from samples drawn from a metastable distribution under minimal assumptions about the system. This follows from a fundamental observation: the single-variable conditionals of metastable distributions that satisfy a strong metastability condition are, on average, close to those of the stationary distribution. This holds even when the metastable distribution differs considerably from the true model in global metrics such as Kullback-Leibler divergence or total variation distance. This property allows us to learn the true model using a conditional-likelihood-based estimator, even when the samples come from a metastable distribution concentrated in a small region of the state space. Explicit examples of such metastable states can be constructed from regions that effectively bottleneck the probability flow and cause poor mixing of the Markov chain. For specific cases of binary pairwise undirected graphical models (i.e., Ising models), we further show rigorously that data from metastable states can be used to learn the parameters of the energy function and recover the structure of the model.
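
To make the role of single-variable conditionals concrete, the following is a minimal sketch for the Ising case. The notation is assumed here for illustration and is not taken from the abstract: spins \sigma_i \in \{-1,+1\}, couplings J_{ij}, fields h_i, and m samples \sigma^{(1)},\dots,\sigma^{(m)}. The estimator shown is the standard unregularized conditional-likelihood (pseudo-likelihood) objective, which may differ from the exact estimator analyzed in the paper.

```latex
% Ising model (assumed parametrization) and its single-variable conditional.
\begin{align*}
\mu(\sigma) &= \frac{1}{Z}\exp\Big(\sum_{i<j} J_{ij}\,\sigma_i\sigma_j + \sum_{i} h_i\,\sigma_i\Big),
  \qquad \sigma \in \{-1,+1\}^n, \\
\mu(\sigma_i \mid \sigma_{-i}) &= \frac{1}{1+\exp\Big(-2\,\sigma_i\big(h_i + \sum_{j\neq i} J_{ij}\,\sigma_j\big)\Big)}, \\
% Conditional-likelihood estimate for the parameters attached to variable i.
(\hat{h}_i, \hat{J}_{i\cdot}) &\in \arg\min_{h_i,\,J_{i\cdot}}
  \frac{1}{m}\sum_{k=1}^{m} \log\Big(1+\exp\Big(-2\,\sigma_i^{(k)}\big(h_i + \sum_{j\neq i} J_{ij}\,\sigma_j^{(k)}\big)\Big)\Big).
\end{align*}
```

The partition function Z cancels in the conditional, so the objective depends on the samples only through single-variable conditionals. This is the quantity the abstract states is, on average, preserved under metastability.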
