We consider the problem of inferring latent stochastic differential equations (SDEs) with a time and memory cost that scales independently of the amount of data, the total length of the time series, and the stiffness of the approximate differential equations. This is in stark contrast to typical methods for inferring latent differential equations, which, despite their constant memory cost, have a time complexity that depends heavily on the stiffness of the approximate differential equation. We achieve this computational advancement by removing the need to solve differential equations when approximating gradients, using a novel amortization strategy coupled with a recently derived reparametrization of expectations under linear SDEs. We show that, in practice, this allows us to achieve performance similar to that of methods based on adjoint sensitivities with more than an order of magnitude fewer evaluations of the model in training.
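To give some intuition for how expectations under a linear SDE can be estimated without a differential equation solver, the following is a minimal sketch and not the method of this work. It assumes a scalar Ornstein-Uhlenbeck process dx = -theta * x dt + sigma dW with a fixed initial value, whose marginals are Gaussian in closed form. The function names (`ou_marginal`, `integral_estimate`, `exact_integral_of_second_moment`) and all parameter values are hypothetical and chosen only for illustration. The expected time integral of f(x_t) is estimated by drawing random times and reparametrized Gaussian samples, so the cost depends on the number of samples rather than on a solver step size.

```python
import math
import random


def ou_marginal(t, x0, theta, sigma):
    """Closed-form mean and variance of x_t for dx = -theta*x dt + sigma dW, x_0 = x0."""
    mean = x0 * math.exp(-theta * t)
    var = sigma**2 / (2.0 * theta) * (1.0 - math.exp(-2.0 * theta * t))
    return mean, var


def integral_estimate(f, T, x0, theta, sigma, n_samples, rng):
    """Monte Carlo estimate of E[ integral_0^T f(x_t) dt ] with no SDE solver.

    Times are drawn uniformly on [0, T]; states are drawn from the exact
    Gaussian marginal via the reparametrization x = mean + sqrt(var) * eps.
    """
    total = 0.0
    for _ in range(n_samples):
        t = rng.uniform(0.0, T)
        mean, var = ou_marginal(t, x0, theta, sigma)
        eps = rng.gauss(0.0, 1.0)  # standard normal noise
        x = mean + math.sqrt(var) * eps
        total += f(x)
    return T * total / n_samples


def exact_integral_of_second_moment(T, x0, theta, sigma):
    """Analytic value of integral_0^T E[x_t^2] dt, used only as a sanity check."""
    s = sigma**2 / (2.0 * theta)  # stationary variance
    return (x0**2 - s) * (1.0 - math.exp(-2.0 * theta * T)) / (2.0 * theta) + s * T


if __name__ == "__main__":
    rng = random.Random(0)
    est = integral_estimate(lambda x: x * x, 2.0, 1.0, 1.5, 0.8, 100_000, rng)
    ref = exact_integral_of_second_moment(2.0, 1.0, 1.5, 0.8)
    print(f"Monte Carlo: {est:.4f}  analytic: {ref:.4f}")
```

In this sketch a stiffer process (larger `theta`) changes only the closed-form marginal, not the number of function evaluations, which is the kind of stiffness-independent cost described above.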