As a representative continuous-depth neural network approach, stochastic differential equation (SDE)-based Bayesian neural networks (BNNs) have attracted considerable attention due to their solid theoretical foundations and strong potential for real-world applications. However, their reliance on numerical SDE solvers requires a large number of function evaluations (NFE), which leads to high computational cost and occasional convergence instability. To address these challenges, we propose a Nesterov-accelerated gradient (NAG) enhanced SDE-BNN model. By integrating NAG into the SDE-BNN framework together with an NFE-dependent residual skip connection, our method accelerates convergence and substantially reduces the NFE during both training and testing. Extensive empirical results show that our model consistently outperforms conventional SDE-BNNs across various tasks, including image classification and sequence modeling, achieving a lower NFE and higher predictive accuracy.