Transformer models have shown great success in natural language processing; however, their potential remains mostly unexplored for dynamical systems. In this work, we investigate the optimal output estimation problem using transformers, which predict each output from all past outputs. Specifically, we train the transformer on a variety of distinct systems and then evaluate its performance on unseen systems with unknown dynamics. Empirically, the trained transformer adapts exceedingly well to different unseen systems and, for linear systems, even matches the optimal performance given by the Kalman filter. In more complex settings with non-i.i.d. noise, time-varying dynamics, and nonlinear dynamics such as a quadrotor system with unknown parameters, transformers also demonstrate promising results. To support our experimental findings, we provide statistical guarantees that quantify the amount of training data required for the transformer to achieve a desired excess risk. Finally, we point out some limitations by identifying two classes of problems that lead to degraded performance, highlighting the need for caution when using transformers for control and estimation.
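For readers unfamiliar with the baseline mentioned above, the following is a minimal sketch of one-step-ahead output prediction with a Kalman filter for a linear system x[t+1] = A x[t] + w[t], y[t] = C x[t] + v[t]. It is an illustration under stated assumptions and not the implementation used in this work: the matrices A, C, Q, R, the horizon T, and the function names are hypothetical choices, and the filter is given the true system parameters, which the transformer does not have access to.

```python
import numpy as np


def simulate(A, C, Q, R, T, rng):
    """Simulate outputs of x[t+1] = A x[t] + w[t], y[t] = C x[t] + v[t]."""
    n, m = A.shape[0], C.shape[0]
    x = rng.multivariate_normal(np.zeros(n), np.eye(n))
    ys = []
    for _ in range(T):
        ys.append(C @ x + rng.multivariate_normal(np.zeros(m), R))
        x = A @ x + rng.multivariate_normal(np.zeros(n), Q)
    return np.array(ys)


def kalman_output_predictions(A, C, Q, R, ys):
    """Predict each output y[t] from the outputs y[0], ..., y[t-1]."""
    n = A.shape[0]
    x_pred = np.zeros(n)   # prior state mean (assumed zero)
    P_pred = np.eye(n)     # prior state covariance (assumed identity)
    preds = []
    for y in ys:
        # Output prediction made before y is observed.
        preds.append(C @ x_pred)
        # Measurement update with the newly observed output.
        S = C @ P_pred @ C.T + R
        K = P_pred @ C.T @ np.linalg.inv(S)
        x_filt = x_pred + K @ (y - C @ x_pred)
        P_filt = (np.eye(n) - K @ C) @ P_pred
        # Time update to the next step.
        x_pred = A @ x_filt
        P_pred = A @ P_filt @ A.T + Q
    return np.array(preds)


# Hypothetical stable two-state system with a scalar output.
A = np.array([[0.9, 0.2], [0.0, 0.8]])
C = np.array([[1.0, 0.0]])
Q = 0.1 * np.eye(2)
R = 0.1 * np.eye(1)

rng = np.random.default_rng(0)
ys = simulate(A, C, Q, R, T=2000, rng=rng)
preds = kalman_output_predictions(A, C, Q, R, ys)

mse_kalman = float(np.mean((ys - preds) ** 2))
mse_zero = float(np.mean(ys ** 2))  # naive baseline: always predict zero
print(f"Kalman MSE: {mse_kalman:.3f}, zero-predictor MSE: {mse_zero:.3f}")
```

In this setting the Kalman filter's mean squared prediction error serves as the reference value against which a learned predictor can be compared; the zero predictor is included only as a naive point of comparison.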