Neural ODEs are a recently developed model class that combines the strong model priors of differential equations with the high-capacity function approximation of neural networks. One advantage of Neural ODEs is the potential for memory-efficient training via the continuous adjoint method. However, memory-efficient training comes at the cost of approximate gradients. Therefore, in practice, gradients are often obtained by simply backpropagating through the internal operations of the forward ODE solve, incurring a high memory cost. Interestingly, it is possible to construct algebraically reversible ODE solvers that allow for both exact gradients and the memory efficiency of the continuous adjoint method. Unfortunately, current reversible solvers are low-order and suffer from poor numerical stability, so their use in practice is limited. In this work, we present a class of algebraically reversible solvers that are both high-order and numerically stable. Moreover, any explicit numerical scheme can be made reversible by our method. This construction naturally extends to numerical schemes for Neural CDEs and SDEs.
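To make "algebraically reversible" concrete: such a solver's update can be inverted exactly in closed form, so the backward pass can reconstruct every intermediate state instead of storing it. The sketch below illustrates this with the reversible Heun scheme (Kidger et al., 2021), one of the existing low-order reversible solvers the abstract alludes to, not the high-order construction introduced in this work. The vector field `f`, step size, and tolerances are illustrative choices.

```python
import numpy as np

def f(y):
    # Illustrative vector field: linear dynamics dy/dt = -y.
    return -y

def forward_step(y, yhat, h):
    # One step of the reversible Heun scheme. The state is a pair
    # (y, yhat); both are advanced at each step.
    yhat_next = 2.0 * y - yhat + h * f(yhat)
    y_next = y + 0.5 * h * (f(yhat) + f(yhat_next))
    return y_next, yhat_next

def backward_step(y_next, yhat_next, h):
    # Exact algebraic inverse of forward_step: (y, yhat) is recovered
    # from (y_next, yhat_next) alone, so no intermediate states need
    # to be stored during the forward solve.
    yhat = 2.0 * y_next - yhat_next - h * f(yhat_next)
    y = y_next - 0.5 * h * (f(yhat) + f(yhat_next))
    return y, yhat

# Integrate forward, then reverse, and check the start is recovered
# up to floating-point round-off.
h, n_steps = 0.01, 1000
y0 = np.array([1.0, -0.5])
y, yhat = y0.copy(), y0.copy()
for _ in range(n_steps):
    y, yhat = forward_step(y, yhat, h)
for _ in range(n_steps):
    y, yhat = backward_step(y, yhat, h)
print(np.max(np.abs(y - y0)))  # small reconstruction error, near machine precision
```

Because the reverse pass regenerates each state exactly (in exact arithmetic; up to round-off in floating point), gradients computed by backpropagating through the reconstructed trajectory are exact while memory cost stays O(1) in the number of steps, which is the combination of properties the abstract describes.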