Perturbation analysis and the operator adjoint method are used to derive the correct adjoint form rigorously. The derivation yields three results: 1) the loss gradient is given by an integral rather than by an ODE, and we show why; 2) the traditional adjoint form is not equivalent to the backpropagation (BP) result; 3) the adjoint operator analysis shows that the adjoint form gives the same result as BP if and only if the discrete adjoint uses the same scheme as the discrete neural ODE.
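Result 3 can be illustrated numerically with a minimal sketch. It does not reproduce the derivation summarized above; the scalar dynamics f(z, θ) = tanh(θz), the forward Euler scheme, the loss L = z_N²/2, and all function names are assumptions chosen for illustration. The "matched" adjoint is the exact transpose of the forward Euler step, which is what BP computes, while the "mismatched" adjoint applies an explicit Euler step to the continuous adjoint ODE backward in time and therefore evaluates the Jacobians at z_{n+1} instead of z_n. In both cases the parameter gradient is accumulated as a sum over steps, that is, as a quadrature of an integral.

```python
import math

# Hypothetical scalar "neural ODE" dz/dt = f(z, theta), chosen only for illustration.
def f(z, theta):
    return math.tanh(theta * z)

def df_dz(z, theta):
    return theta * (1.0 - math.tanh(theta * z) ** 2)

def df_dtheta(z, theta):
    return z * (1.0 - math.tanh(theta * z) ** 2)

def forward_euler(z0, theta, h, n_steps):
    """Discrete neural ODE: z_{n+1} = z_n + h * f(z_n, theta)."""
    zs = [z0]
    for _ in range(n_steps):
        zs.append(zs[-1] + h * f(zs[-1], theta))
    return zs

def loss(z0, theta, h, n_steps):
    """Assumed loss L = 0.5 * z_N**2 on the final discrete state."""
    return 0.5 * forward_euler(z0, theta, h, n_steps)[-1] ** 2

def adjoint_gradient(z0, theta, h, n_steps, matched):
    """dL/dtheta from a discrete adjoint sweep.

    matched=True : transpose of the forward Euler step (Jacobians at z_n),
                   which is what backpropagation computes.
    matched=False: explicit Euler on the continuous adjoint ODE, integrated
                   backward in time (Jacobians at z_{n+1}).
    """
    zs = forward_euler(z0, theta, h, n_steps)
    a = zs[-1]      # a_N = dL/dz_N
    grad = 0.0      # accumulated as a sum, i.e. a quadrature of an integral
    for n in range(n_steps - 1, -1, -1):
        z_eval = zs[n] if matched else zs[n + 1]
        grad += h * a * df_dtheta(z_eval, theta)
        a = a * (1.0 + h * df_dz(z_eval, theta))
    return grad

z0, theta, h, n_steps = 1.0, 0.7, 0.1, 10

# Reference: central finite difference of the discrete loss.
eps = 1e-6
grad_fd = (loss(z0, theta + eps, h, n_steps)
           - loss(z0, theta - eps, h, n_steps)) / (2.0 * eps)
grad_matched = adjoint_gradient(z0, theta, h, n_steps, matched=True)
grad_mismatched = adjoint_gradient(z0, theta, h, n_steps, matched=False)

print("finite difference :", grad_fd)
print("matched adjoint   :", grad_matched)
print("mismatched adjoint:", grad_mismatched)
```

Under these assumptions the matched adjoint agrees with the finite-difference gradient of the discrete loss up to finite-difference error, whereas the mismatched adjoint differs by a term that shrinks with the step size h but does not vanish for any fixed h.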