Autonomous navigation in marine environments is challenging, especially in the presence of spatially varying flow disturbances and both static and dynamic obstacles. In this work, we demonstrate that incorporating local flow field measurements fundamentally alters the nature of the problem, transforming otherwise unsolvable navigation scenarios into tractable ones. Flow data alone is not sufficient, however; it must be effectively fused with conventional sensory inputs such as the ego-state and obstacle states. To this end, we propose \textbf{MarineFormer}, a Transformer-based policy architecture that integrates two complementary attention mechanisms: spatial attention for sensor fusion and temporal attention for capturing environmental dynamics. MarineFormer is trained end-to-end via reinforcement learning in a 2D simulated environment with realistic flow features and obstacles. Extensive evaluations against classical and state-of-the-art baselines show that our approach improves the episode completion success rate by nearly 23\% while reducing path length. Ablation studies further highlight the critical role of flow measurements and the effectiveness of the proposed architecture in leveraging them.
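The two attention stages named in the abstract can be illustrated with a minimal NumPy sketch. It is not the MarineFormer implementation: the function names, tensor shapes, the choice of the ego-state as the spatial query, the residual connection, and the use of the latest time step as the temporal query are all assumptions made for illustration, and learned projection matrices are omitted entirely.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(q, k, v):
    # Scaled dot-product attention.
    # q: (..., Lq, d); k and v: (..., Lk, d).
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


def fuse_and_summarize(ego, flow, obstacles):
    # Hypothetical helper, not taken from the paper.
    # ego: (T, d) ego-state embedding per time step
    # flow: (T, F, d) embeddings of F local flow measurements
    # obstacles: (T, O, d) embeddings of O obstacle states

    # Spatial attention (assumed form): at each time step the ego token
    # queries the flow and obstacle tokens, fusing the sensor inputs.
    tokens = np.concatenate([flow, obstacles], axis=1)   # (T, F+O, d)
    fused, spatial_w = attention(ego[:, None, :], tokens, tokens)
    fused = fused[:, 0, :] + ego                         # residual, (T, d)

    # Temporal attention (assumed form): the latest fused feature queries
    # the whole history to summarize how the environment is changing.
    summary, temporal_w = attention(fused[-1:, :], fused, fused)
    return summary[0], spatial_w[:, 0, :], temporal_w[0]


rng = np.random.default_rng(0)
T, F, O, d = 5, 4, 3, 8
feature, spatial_w, temporal_w = fuse_and_summarize(
    rng.normal(size=(T, d)),
    rng.normal(size=(T, F, d)),
    rng.normal(size=(T, O, d)),
)
print(feature.shape, spatial_w.shape, temporal_w.shape)
```

In a reinforcement learning setting, a feature vector like `feature` would typically feed the policy and value heads; each row of `spatial_w` shows how much weight the ego query places on each flow or obstacle token at that time step.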