Markov jump processes (MJPs) are continuous-time stochastic processes that describe dynamical systems evolving in discrete state spaces. These processes find wide application in the natural sciences and machine learning, but inferring them from data is known to be far from trivial. In this work we introduce a methodology for zero-shot inference of MJPs on bounded state spaces from noisy and sparse observations. The methodology consists of two components. The first is a broad probability distribution over families of MJPs, as well as over possible observation times and noise mechanisms, from which we simulate a synthetic dataset of hidden MJPs and their noisy observation processes. The second is a neural network model that processes subsets of the simulated observations and is trained, in a supervised way, to output the initial condition and rate matrix of the target MJP. We empirically demonstrate that one and the same (pretrained) model can infer, in a zero-shot fashion, hidden MJPs evolving in state spaces of different dimensionalities. Specifically, we infer MJPs that describe (i) discrete flashing ratchet systems, which are a type of Brownian motor, and the conformational dynamics in (ii) molecular simulations, (iii) experimental ion channel data and (iv) simple protein folding models. Moreover, we show that our model performs on par with state-of-the-art models that are finetuned to the target datasets.
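To make the first component concrete, the following is a minimal sketch of how one synthetic training example could be generated: draw a rate matrix, simulate a hidden MJP path with the Gillespie algorithm, and read it out at sparse, noisy observation times. It is an illustration under stated assumptions, not the distribution used in this work: the Gamma prior on the rates, the uniform observation times, the label-flip noise, and all function names (`sample_rate_matrix`, `simulate_mjp`, `observe`) are hypothetical choices.

```python
import numpy as np


def sample_rate_matrix(n_states, rng, scale=1.0):
    """Draw a rate matrix: nonnegative off-diagonals, rows summing to zero.

    Assumption: i.i.d. Gamma(1, scale) off-diagonal rates.
    """
    Q = rng.gamma(shape=1.0, scale=scale, size=(n_states, n_states))
    np.fill_diagonal(Q, 0.0)
    np.fill_diagonal(Q, -Q.sum(axis=1))
    return Q


def simulate_mjp(Q, p0, t_end, rng):
    """Gillespie simulation on [0, t_end).

    Returns the jump times (starting at 0.0) and the state entered at each.
    """
    state = rng.choice(len(p0), p=p0)
    t = 0.0
    times, states = [0.0], [state]
    while True:
        exit_rate = -Q[state, state]
        if exit_rate <= 0.0:
            break  # absorbing state: no further jumps
        t += rng.exponential(1.0 / exit_rate)  # exponential holding time
        if t >= t_end:
            break
        probs = Q[state].copy()
        probs[state] = 0.0
        probs /= exit_rate  # jump probabilities of the embedded chain
        state = rng.choice(len(probs), p=probs)
        times.append(t)
        states.append(state)
    return np.array(times), np.array(states)


def observe(times, states, obs_times, n_states, flip_prob, rng):
    """Read the path at obs_times, then corrupt the labels.

    Assumption: with probability flip_prob a label is replaced by a
    uniformly drawn state (which may coincide with the true one).
    """
    idx = np.searchsorted(times, obs_times, side="right") - 1
    clean = states[idx]
    noisy = clean.copy()
    flip = rng.random(len(obs_times)) < flip_prob
    noisy[flip] = rng.integers(0, n_states, size=flip.sum())
    return clean, noisy


rng = np.random.default_rng(0)
n_states, t_end = 4, 10.0
Q = sample_rate_matrix(n_states, rng)
p0 = np.full(n_states, 1.0 / n_states)  # uniform initial condition
times, states = simulate_mjp(Q, p0, t_end, rng)
obs_times = np.sort(rng.uniform(0.0, t_end, size=20))  # sparse, irregular grid
clean, noisy = observe(times, states, obs_times, n_states, 0.1, rng)
```

In a supervised setup of the kind described above, the pair (`obs_times`, `noisy`) would serve as the network input and (`p0`, `Q`) as the regression target.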