Backpropagation in neural networks relies on a fundamental component of automatic differentiation (AD): reverse-mode differentiation, also called the vector-Jacobian product (VJP) or, in differential geometry, the pull-back. Computing the gradient is essential because neural network parameters are updated by gradient descent. In this study, we present a generic randomized method that updates the parameters of a neural network using directional derivatives of the loss function, computed efficiently with forward-mode AD, that is, with Jacobian-vector products (JVPs). These JVPs are evaluated along random directions sampled from different probability distributions, e.g., the Bernoulli, Normal, Wigner, Laplace, and Uniform distributions. The gradient computation is carried out during the forward pass of the neural network. We also present a rigorous analysis of the proposed methods, including their rates of convergence, together with computational experiments in scientific machine learning, in particular physics-informed neural networks and Deep Operator Networks.
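To make the idea concrete, the following is a minimal sketch of a JVP-based randomized update, not the implementation used in this study. It rests on several assumptions made only for illustration: forward-mode AD is emulated with a small hypothetical `Dual` class instead of an AD library (in practice a routine such as `jax.jvp` would play this role), the loss is a toy quadratic, the "bernoulli" option is taken to mean symmetric ±1 entries, and the update rule is θ ← θ − η (∇f(θ)·v) v with a random direction v. Under these assumptions E[v vᵀ] = I, so (∇f·v) v is an unbiased estimate of the gradient.

```python
import random


class Dual:
    """Dual number val + tan*eps with eps**2 == 0 (hypothetical helper).

    `tan` carries the directional derivative through each operation.
    """

    def __init__(self, val, tan=0.0):
        self.val, self.tan = val, tan

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val, self.tan + o.tan)

    __radd__ = __add__

    def __sub__(self, other):
        o = Dual.lift(other)
        return Dual(self.val - o.val, self.tan - o.tan)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        o = Dual.lift(other)
        # Product rule: (a*b)' = a*b' + a'*b
        return Dual(self.val * o.val, self.val * o.tan + self.tan * o.val)

    __rmul__ = __mul__


def jvp(f, theta, v):
    """Return (f(theta), grad f(theta) . v) from one forward evaluation."""
    out = f([Dual(t, d) for t, d in zip(theta, v)])
    return out.val, out.tan


def loss(theta):
    # Toy loss (an assumption): sum_i (theta_i - i)^2, minimised at theta_i = i.
    total = 0.0
    for i, t in enumerate(theta):
        diff = t - float(i)
        total = total + diff * diff
    return total


def sample_direction(n, dist, rng):
    # Both choices have zero mean and identity covariance.
    if dist == "normal":
        return [rng.gauss(0.0, 1.0) for _ in range(n)]
    if dist == "bernoulli":  # assumed here to mean symmetric +/-1 entries
        return [rng.choice((-1.0, 1.0)) for _ in range(n)]
    raise ValueError(f"unknown distribution: {dist}")


def train(theta, steps=500, lr=0.02, dist="normal", seed=0):
    rng = random.Random(seed)
    for _ in range(steps):
        v = sample_direction(len(theta), dist, rng)
        _, dir_deriv = jvp(loss, theta, v)  # forward pass only
        # Step along v, scaled by the directional derivative.
        theta = [t - lr * dir_deriv * vi for t, vi in zip(theta, v)]
    return theta


if __name__ == "__main__":
    start = [5.0, -3.0, 2.0]
    for dist in ("normal", "bernoulli"):
        final = train(start, dist=dist)
        print(dist, loss(start), loss(final))
```

Each step needs only one forward evaluation of the loss and no backward pass. The step size and iteration count here are illustrative values chosen for the toy quadratic, not values taken from the analysis.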