Inference in conditioned dynamics through causality restoration

Computing observables from conditioned dynamics is typically computationally hard: although obtaining independent samples efficiently from the unconditioned dynamics is usually feasible, most of those samples must generally be discarded (in a form of importance sampling) because they do not satisfy the imposed conditions. Sampling directly from the conditioned distribution is non-trivial, as conditioning breaks the causal properties of the dynamics, which are ultimately what render the sampling procedure efficient. One standard way of doing so is through a Metropolis Monte Carlo procedure, but this procedure is normally slow, and a very large number of Monte Carlo steps is needed to obtain a small number of statistically independent samples. In this work, we propose an alternative method to produce independent samples from a conditioned distribution. The method learns the parameters of a generalized dynamical model that optimally describe the conditioned distribution in a variational sense. The outcome is an effective, unconditioned, dynamical model from which one can trivially obtain independent samples, effectively restoring the causality of the conditioned distribution. The consequences are twofold. On the one hand, the method allows us to efficiently compute observables from the conditioned dynamics by simply averaging over independent samples. On the other hand, it gives an effective unconditioned distribution that is easier to interpret. The method is flexible and can be applied to virtually any dynamics. We discuss an important application of the method, namely the problem of epidemic risk assessment from (imperfect) clinical tests, for a large family of time-continuous epidemic models endowed with a Gillespie-like sampler. We show that the method compares favorably against the state of the art, including the soft-margin approach and mean-field methods.
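
The inefficiency of discarding unconditioned samples can be illustrated with a minimal sketch. It is not the method proposed in this work, only the naive baseline the abstract describes. Everything in it is an illustrative assumption: a toy SI model on a six-node line graph, a single perfect test stating that the last node is infected at the final time, and the hypothetical helper names `gillespie_si` and `rejection_sample`.

```python
import random


def gillespie_si(neighbors, rate, t_max, rng, seed_node=0):
    """Sample one unconditioned SI trajectory with a Gillespie-like scheme.

    Returns a dict of infection times (inf if a node is never infected).
    """
    t_inf = {v: float("inf") for v in neighbors}
    t_inf[seed_node] = 0.0
    infected = {seed_node}
    t = 0.0
    while True:
        # Every infected-susceptible edge transmits at the same rate.
        edges = [(u, v) for u in sorted(infected)
                 for v in neighbors[u] if v not in infected]
        if not edges:
            break
        # Waiting time to the next infection event is exponential.
        t += rng.expovariate(rate * len(edges))
        if t > t_max:
            break
        _, v = rng.choice(edges)
        t_inf[v] = t
        infected.add(v)
    return t_inf


def rejection_sample(n_samples, sampler, condition):
    """Draw unconditioned samples and keep only those meeting the condition."""
    return [x for x in (sampler() for _ in range(n_samples)) if condition(x)]


# Toy setup (assumed for illustration): line graph 0-1-2-3-4-5.
n_nodes, rate, t_max, n_samples = 6, 0.5, 4.0, 2000
neighbors = {i: [j for j in (i - 1, i + 1) if 0 <= j < n_nodes]
             for i in range(n_nodes)}
rng = random.Random(0)

# Observation (assumed perfect test): the last node is infected by t_max.
accepted = rejection_sample(
    n_samples,
    sampler=lambda: gillespie_si(neighbors, rate, t_max, rng),
    condition=lambda t_inf: t_inf[n_nodes - 1] <= t_max,
)
print(f"accepted {len(accepted)} of {n_samples} unconditioned samples")
```

In this toy setting the observation requires five infection events within the time window, so only a small fraction of the unconditioned trajectories survives. With more observations the accepted fraction shrinks further, which is the difficulty that motivates sampling from an effective unconditioned model instead.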
