Stabilizing Backpropagation Through Time to Learn Complex Physics

Of all the vector fields surrounding the minima of recurrent learning setups, the gradient field with its exploding and vanishing updates appears a poor choice for optimization, offering little beyond efficient computability. We seek to improve this suboptimal practice in the context of physics simulations, where backpropagating feedback through many unrolled time steps is considered crucial to acquiring temporally coherent behavior. The alternative vector field we propose follows from two principles: physics simulators, unlike neural networks, have a balanced gradient flow, and certain modifications to the backpropagation pass leave the positions of the original minima unchanged. As any modification of backpropagation decouples forward and backward pass, the rotation-free character of the gradient field is lost. Therefore, we discuss the negative implications of using such a rotational vector field for optimization and how to counteract them. Our final procedure is easily implementable via a sequence of gradient stopping and component-wise comparison operations, which do not negatively affect scalability. Our experiments on three control problems show that especially as we increase the complexity of each task, the unbalanced updates from the gradient can no longer provide the precise control signals necessary while our method still solves the tasks. Our code can be found at https://github.com/tum-pbs/StableBPTT.
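The exploding and vanishing updates mentioned in the abstract can be illustrated with a toy example. The sketch below is not the paper's setup or method; it assumes a hypothetical scalar recurrence x_{t+1} = w * x_t and accumulates the derivative of the final state with respect to the initial state, one unrolled step at a time.

```python
def unrolled_gradient(w, steps):
    """Return d x_T / d x_0 for the toy recurrence x_{t+1} = w * x_t.

    Backpropagating through the unrolled loop multiplies the incoming
    gradient by the local derivative (here simply w) once per time step.
    """
    grad = 1.0
    for _ in range(steps):
        grad *= w  # one factor of w per unrolled step
    return grad


# Factors slightly below or above 1 compound over 100 steps.
for w in (0.9, 1.0, 1.1):
    print(w, unrolled_gradient(w, 100))
```

With a per-step factor below 1 the gradient shrinks towards zero, and with a factor above 1 it grows by orders of magnitude; only a factor of exactly 1 keeps it balanced over long horizons.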

Backpropagation

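Strictly speaking, the term backpropagation refers only to the algorithm for computing the gradient, not to how the gradient is used. The term is nevertheless often used loosely for the entire learning algorithm, including how the gradient is applied, for example by stochastic gradient descent. Backpropagation generalizes the gradient computation of the delta rule, which is its single-layer version, and is in turn generalized by automatic differentiation, of which backpropagation is a special case known as reverse accumulation (or "reverse mode"). In machine learning, backpropagation (backprop) is a widely used algorithm for training feedforward neural networks in supervised learning. Generalizations exist for other artificial neural networks (ANNs), and this whole class of algorithms is commonly called "backpropagation". The algorithm computes the gradient of the loss function with respect to each weight via the chain rule, one layer at a time, iterating backward from the last layer so that intermediate terms of the chain rule are not recomputed.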
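The layer-by-layer chain rule described above can be sketched on a tiny example. The code below is an illustrative assumption rather than anything from the source: a hypothetical two-layer scalar network with a tanh hidden unit and a squared-error loss, with the hand-derived gradients checked against central finite differences.

```python
import math


def forward(x, w1, w2):
    """Two-layer scalar network: h = tanh(w1 * x), y = w2 * h."""
    h = math.tanh(w1 * x)
    y = w2 * h
    return h, y


def loss_and_grads(x, target, w1, w2):
    """Squared-error loss and its weight gradients, computed last layer first."""
    h, y = forward(x, w1, w2)
    loss = 0.5 * (y - target) ** 2
    dL_dy = y - target                   # gradient at the output
    dL_dw2 = dL_dy * h                   # weight of the last layer
    dL_dh = dL_dy * w2                   # intermediate term, reused below
    dL_dw1 = dL_dh * (1.0 - h * h) * x   # tanh'(z) = 1 - tanh(z)^2
    return loss, dL_dw1, dL_dw2


def numeric_grad(f, w, eps=1e-6):
    """Central finite-difference estimate of df/dw."""
    return (f(w + eps) - f(w - eps)) / (2 * eps)


x, target, w1, w2 = 0.7, 0.3, 0.5, -1.2
loss, g1, g2 = loss_and_grads(x, target, w1, w2)
n1 = numeric_grad(lambda w: loss_and_grads(x, target, w, w2)[0], w1)
n2 = numeric_grad(lambda w: loss_and_grads(x, target, w1, w)[0], w2)
print(abs(g1 - n1) < 1e-6, abs(g2 - n2) < 1e-6)
```

The term `dL_dh` is computed once and reused for the earlier layer, which is the saving the backward iteration provides over differentiating each weight separately.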
