Deep reinforcement learning has achieved significant results in low-level control tasks. However, in applications such as autonomous driving and drone flying, stable control is difficult because the agent may change its actions abruptly, which often lowers the control system's efficiency, induces excessive mechanical wear, and causes uncontrollable, dangerous vehicle behavior. Recently, a method called conditioning for action policy smoothness (CAPS) was proposed to address jerkiness with low-dimensional features in applications such as quadrotor drones. To cope with high-dimensional features, this paper proposes image-based regularization for action smoothness (I-RAS) to address jerky control in autonomous miniature car racing. We also introduce IR control, a control scheme based on the impact ratio that adaptively sets the regularization weight governing the smoothness constraint. In the experiment, an agent with I-RAS and IR control significantly improves the success rate from 59% to 95%. In the real-world-track experiment, the agent also outperforms other methods, reducing the average lap time and improving the completion rate, even without real-world training. This is further supported by an agent based on I-RAS winning the 2022 AWS DeepRacer Final Championship Cup.
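As a rough illustration of the kind of smoothness regularization referred to above, the following sketch computes CAPS-style temporal and spatial penalties on a batch of states and adds them to a policy loss. It is not the I-RAS or IR control formulation, which the abstract does not specify; the function names, the noise scale `sigma`, and the fixed weights `lam_t` and `lam_s` are assumptions made for illustration only.

```python
import numpy as np

def caps_smoothness_terms(policy, states, next_states, sigma=0.05, rng=None):
    """Return (temporal, spatial) smoothness penalties averaged over a batch.

    `policy` is any callable mapping a (batch, state_dim) array to a
    (batch, action_dim) array. All names here are hypothetical.
    """
    rng = np.random.default_rng() if rng is None else rng
    actions = policy(states)
    # Temporal term: actions at consecutive time steps should be close.
    temporal = np.linalg.norm(actions - policy(next_states), axis=-1).mean()
    # Spatial term: actions for slightly perturbed states should be close.
    noisy_states = states + rng.normal(0.0, sigma, size=states.shape)
    spatial = np.linalg.norm(actions - policy(noisy_states), axis=-1).mean()
    return temporal, spatial

def regularized_loss(policy_loss, temporal, spatial, lam_t=1.0, lam_s=1.0):
    """Add weighted smoothness penalties to a policy loss that is minimized.

    The weights are fixed here (an assumption); an adaptive scheme such as
    IR control would adjust them during training.
    """
    return policy_loss + lam_t * temporal + lam_s * spatial

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = rng.normal(size=(3, 2))            # toy linear policy, hypothetical
    policy = lambda s: np.tanh(s @ weights)
    states = rng.normal(size=(8, 3))
    next_states = states + 0.1 * rng.normal(size=(8, 3))
    t, s = caps_smoothness_terms(policy, states, next_states, rng=rng)
    print(regularized_loss(1.0, t, s, lam_t=0.5, lam_s=0.5))
```

In this sketch the state is a small vector; for image observations the same penalties would be applied to the policy's outputs on image inputs, and how the perturbed image is produced is a design choice not covered here.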