In the era of AI, neural networks have become increasingly popular for modeling, inference, and prediction, largely due to their potential for universal approximation. With the proliferation of such deep learning models, a question arises: are leaner statistical methods still relevant? To shed light on this question, we employ the mechanistic nonlinear ordinary differential equation (ODE) inverse problem as a testbed, using the physics-informed neural network (PINN) as a representative of the deep learning paradigm and manifold-constrained Gaussian process inference (MAGI) as a representative of statistically principled methods. Through case studies involving the SEIR model from epidemiology and the Lorenz model from chaotic dynamics, we demonstrate that statistical methods are far from obsolete, especially when working with sparse and noisy observations. On tasks such as parameter inference and trajectory reconstruction, statistically principled methods consistently achieve lower bias and variance, while using far fewer parameters and requiring less hyperparameter tuning. Statistical methods can also decisively outperform deep learning models on out-of-sample future prediction, where the absence of relevant data often leads overparameterized models astray. Additionally, we find that statistically principled approaches are more robust to the accumulation of numerical imprecision and represent the underlying system with greater fidelity to the true governing ODEs.