Neural Ordinary Differential Equations (NODEs), a framework of continuous-depth neural networks, have been widely applied and have shown strong performance on representative datasets. Recently, an augmented framework was developed to overcome some limitations that emerged in applications of the original framework. In this paper, we propose a new class of continuous-depth neural networks with delay, named Neural Delay Differential Equations (NDDEs). To compute the corresponding gradients, we use the adjoint sensitivity method to obtain the delayed dynamics of the adjoint. Differential equations with delays are typically regarded as infinite-dimensional dynamical systems that exhibit richer dynamics. Compared to NODEs, NDDEs have a stronger capacity for nonlinear representation. We use several illustrative examples to demonstrate this capacity. First, we successfully model, in either a model-free or a model-based manner, delayed dynamics whose trajectories in the lower-dimensional phase space can intersect one another and can even be chaotic. Traditional NODEs, without any augmentation, are not directly applicable to such modeling. Second, we achieve lower loss and higher accuracy not only on data produced synthetically by complex models but also on CIFAR10, a well-known image dataset. Our results on NDDEs demonstrate that appropriately incorporating elements of dynamical systems into the network design is beneficial for network performance.
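The role of the delay can be illustrated with a small numerical sketch. It is not the NDDE architecture or the adjoint computation proposed here: it uses plain fixed-step Euler integration with no neural network, and the helper name `integrate_dde` and the test equation dx/dt = -x(t - tau) are assumptions chosen for illustration. With tau = 0 the scalar equation is an ordinary ODE whose solution decays without ever changing sign; with tau = 1 the same right-hand side, fed the delayed state, yields a solution that crosses zero, which a one-dimensional autonomous ODE cannot do.

```python
def integrate_dde(f, history, tau, t_end, dt=0.01):
    """Fixed-step Euler integration of dx/dt = f(x(t), x(t - tau)).

    history: function giving x(t) for t <= 0 (the initial function).
    Returns the states at t = 0, dt, 2*dt, ..., t_end as a list.
    """
    lag = round(tau / dt)        # delay expressed in whole steps
    n_steps = round(t_end / dt)
    xs = [history(0.0)]
    for k in range(n_steps):
        j = k - lag              # index of the delayed state
        # Before t = tau the delayed state comes from the initial function.
        x_delayed = xs[j] if j >= 0 else history(j * dt)
        xs.append(xs[k] + dt * f(xs[k], x_delayed))
    return xs


# Illustrative right-hand side (an assumption, not taken from the paper).
def rhs(x, x_delayed):
    return -x_delayed


def constant_history(t):
    return 1.0


xs_delay = integrate_dde(rhs, constant_history, tau=1.0, t_end=10.0)
xs_ode = integrate_dde(rhs, constant_history, tau=0.0, t_end=10.0)

print("delayed solution changes sign:", min(xs_delay) < 0)
print("undelayed solution changes sign:", min(xs_ode) < 0)
```

Because the delayed state depends on a whole segment of past values, the initial condition is a function rather than a single number, which is one way to see why such systems are regarded as infinite-dimensional.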