Recently, there has been growing interest in autonomous shipping due to its potential to improve maritime efficiency and safety. Advanced technologies, such as artificial intelligence, can address the current navigational and operational challenges in autonomous shipping. In particular, inland waterway transport (IWT) presents a unique set of challenges, such as crowded waterways and variable environmental conditions. In such dynamic settings, the reliability and robustness of autonomous shipping solutions are critical for ensuring safe operations. This paper examines the robustness of benchmark deep reinforcement learning (RL) algorithms, implemented for IWT within an autonomous shipping simulator, and their ability to generate effective motion planning policies. We demonstrate that a model-free approach can achieve an adequate policy in the simulator, successfully navigating port environments never encountered during training. We focus particularly on Soft Actor-Critic (SAC), which we show to be inherently more robust to environmental disturbances than MuZero, a state-of-the-art model-based RL algorithm. In this paper, we take a significant step towards developing robust, applied RL frameworks that generalize to various vessel types and can navigate complex port and inland environments and scenarios.