Identity teacher forcing (ITF) enables stable training of deterministic recurrent surrogates for chaotic dynamical systems and has been highly effective for dynamical systems reconstruction (DSR) with recurrent neural networks (RNNs), including interpretable almost-linear RNNs (AL-RNNs). However, as an intervention-based prediction loss (and thus a generalized Bayes update), teacher forcing need not match the free-running model's marginal likelihood geometry. We compare the objective-induced curvatures of ITF and marginal likelihood in a probabilistic switching augmentation of AL-RNNs, estimating ambiguity-aware observed information via Louis' identity. In the switching setting studied here, conditioning on a single forced regime path (as ITF does) inflates curvature, while marginal likelihood curvature is reduced by a missing-information correction when multiple switching explanations remain plausible. In Lorenz-63 experiments, windowed evidence fine-tuning improves held-out evidence but can degrade dynamical quantities of interest (QoIs) relative to ITF-pretrained models.
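The missing-information correction mentioned above can be written out through Louis' identity. The following is a sketch in assumed notation that is not taken from the paper: $x$ denotes the observations, $z$ the latent switching path, and $\theta$ the model parameters.

```latex
% Louis' identity (assumed notation: x = observations, z = latent switching path, \theta = parameters)
\begin{aligned}
I_{\mathrm{obs}}(\theta)
  &= -\nabla_\theta^2 \log p(x \mid \theta) \\
  &= \underbrace{\mathbb{E}_{z \mid x, \theta}\!\left[ -\nabla_\theta^2 \log p(x, z \mid \theta) \right]}_{\text{expected complete-data information}}
   \;-\; \underbrace{\operatorname{Cov}_{z \mid x, \theta}\!\left[ \nabla_\theta \log p(x, z \mid \theta) \right]}_{\text{missing information}}
\end{aligned}
```

The covariance term is positive semidefinite, so it can only reduce curvature relative to the expected complete-data information. It vanishes when the posterior over $z$ concentrates on a single path, which is the situation in which conditioning on one regime path and marginalizing over paths would be expected to agree.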