In unseen and complex outdoor environments, collision avoidance navigation for unmanned aerial vehicle (UAV) swarms is a challenging problem: the UAVs must navigate around varied obstacles against complex backgrounds. Existing collision avoidance navigation methods based on deep reinforcement learning (DRL) show promising performance but generalize poorly, so their performance degrades in unseen environments. To address this issue, we investigate the cause of weak generalization in DRL and propose a novel causal feature selection module. This module can be integrated into the policy network and filters out non-causal factors in the learned representations, thereby reducing the influence of spurious correlations between non-causal factors and action predictions. Experimental results demonstrate that the proposed method achieves robust navigation and effective collision avoidance, especially in scenarios with unseen backgrounds and obstacles, and significantly outperforms existing state-of-the-art algorithms.
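The abstract does not specify how the causal feature selection module is built, so the following is only a minimal illustrative sketch of the general idea of filtering a representation before it reaches the action head. It assumes a per-channel soft gate with a hard cutoff. The function name `select_features`, the `gate_logits` parameters, and the `threshold` value are all hypothetical and are not taken from the paper.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def select_features(features, gate_logits, threshold=0.5):
    """Gate a batch of feature vectors channel by channel.

    features:    array of shape (batch, dim), e.g. an encoder output.
    gate_logits: array of shape (dim,); hypothetical learnable scores,
                 where a high score marks a channel as worth keeping.
    threshold:   channels whose gate value is below this are zeroed out.

    Returns the gated features and the binary keep-mask.
    """
    gates = sigmoid(gate_logits)                       # soft weights in (0, 1)
    mask = (gates >= threshold).astype(features.dtype)  # hard keep/drop decision
    return features * gates * mask, mask


# Toy example: the second channel gets a low gate and is suppressed.
features = np.array([[1.0, 2.0, 3.0, 4.0]])
gate_logits = np.array([4.0, -4.0, 1.0, 2.0])
filtered, mask = select_features(features, gate_logits)
```

In a real policy network the gate scores would be trained jointly with the policy under some objective that favors causal channels. That objective is the substance of the proposed method and is not reproduced here.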