To improve the robustness of transformer neural networks used for temporal-dynamics prediction of chaotic systems, we propose a novel attention mechanism called easy attention, which we demonstrate in time-series reconstruction and prediction. Because self attention only makes use of the inner product of queries and keys, we demonstrate that the keys, queries and softmax are not necessary for obtaining the attention score required to capture long-term dependencies in temporal sequences. By applying singular-value decomposition (SVD) to the softmax attention score, we further observe that self attention compresses the contributions from both queries and keys in the space spanned by the attention score. Therefore, our proposed easy-attention method directly treats the attention scores as learnable parameters. This approach produces excellent results when reconstructing and predicting the temporal dynamics of chaotic systems, exhibiting more robustness and less complexity than self attention or the widely used long short-term memory (LSTM) network. Our results show great potential for applications in more complex high-dimensional dynamical systems.

Keywords: Machine Learning, Transformer, Self Attention, Koopman Operator, Chaotic System.
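The contrast between the two mechanisms can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`matmul`, `softmax_rows`, `rand_matrix`), the sizes `seq_len` and `d_model`, the random initialisation, and the choice to keep a value projection `w_v` in the easy-attention branch are all assumptions made for illustration. Plain Python lists are used so that the example has no dependencies.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax_rows(m):
    """Row-wise softmax with max-subtraction for numerical stability."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def rand_matrix(rows, cols, rng):
    """Hypothetical initialiser: i.i.d. standard-normal entries."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def self_attention(x, w_q, w_k, w_v):
    """Standard scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d = len(w_q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(d) for s in row] for row in matmul(q, k_t)]
    return matmul(softmax_rows(scores), v)

def easy_attention(x, alpha, w_v):
    """Easy-attention sketch: alpha (seq_len x seq_len) is itself the
    trainable score matrix, so no queries, keys or softmax are computed."""
    return matmul(alpha, matmul(x, w_v))

rng = random.Random(0)
seq_len, d_model = 4, 3  # hypothetical sizes
x = rand_matrix(seq_len, d_model, rng)
w_q, w_k, w_v = (rand_matrix(d_model, d_model, rng) for _ in range(3))
alpha = rand_matrix(seq_len, seq_len, rng)  # would be updated by the optimizer

out_self = self_attention(x, w_q, w_k, w_v)
out_easy = easy_attention(x, alpha, w_v)
```

Both functions return a `seq_len` x `d_model` output. In this sketch the score matrix of self attention is recomputed from the input through `w_q` and `w_k` (2 x `d_model`² parameters), whereas the easy-attention scores are `seq_len`² free parameters that do not depend on the input.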