Although reinforcement learning algorithms have had great success in autonomous navigation, they cannot be applied straightforwardly to real autonomous systems without considering safety constraints. The latter are crucial for avoiding unsafe behavior of an autonomous vehicle on the road. To highlight the importance of these constraints, in this study we compare two learnable navigation policies: safe and unsafe. The safe policy takes the constraints into account, while the other does not. We show that the safe policy generates trajectories with more clearance (distance to obstacles) and incurs fewer collisions during training, without sacrificing overall performance.