This paper proposes an Improved Noisy Deep Q-Network (Noisy DQN) to enhance the exploration ability and training stability of Unmanned Aerial Vehicles (UAVs) applying deep reinforcement learning in simulated environments. The method strengthens exploration by combining a residual NoisyLinear layer with an adaptive noise scheduling mechanism, and improves training stability through a smooth loss function and soft target-network updates. Experiments show that the proposed model converges faster than standard DQN, achieves up to $+40$ higher reward, and quickly reaches the task's minimum of 28 steps in the 15 × 15 grid navigation environment. These results indicate that our combined improvements to the NoisyNet network structure, exploration control, and training stability enhance the efficiency and reliability of deep Q-learning.
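To make the two core ingredients concrete, the following is a minimal NumPy sketch of a NoisyLinear layer with factorized Gaussian noise (as in NoisyNet) plus a residual connection, and a Polyak-style soft target-network update. All sizes, the `sigma_init` value, the `noise_scale` hook (a stand-in for the adaptive noise schedule), and `tau` are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

class NoisyLinear:
    """Sketch of a NoisyLinear layer (factorized Gaussian noise) with a
    residual connection added when input/output widths match.
    sigma_init and initialization bounds are illustrative assumptions."""

    def __init__(self, in_features, out_features, sigma_init=0.5, seed=0):
        rng = np.random.default_rng(seed)
        bound = 1.0 / np.sqrt(in_features)
        # Learnable means, uniformly initialized in [-bound, bound]
        self.w_mu = rng.uniform(-bound, bound, (out_features, in_features))
        self.b_mu = rng.uniform(-bound, bound, out_features)
        # Learnable noise scales, constant-initialized
        self.w_sigma = np.full((out_features, in_features), sigma_init * bound)
        self.b_sigma = np.full(out_features, sigma_init * bound)
        self.rng = rng

    @staticmethod
    def _f(x):
        # Factorized-noise transform f(x) = sign(x) * sqrt(|x|)
        return np.sign(x) * np.sqrt(np.abs(x))

    def forward(self, x, noise_scale=1.0):
        # noise_scale is a hypothetical hook where an adaptive noise
        # schedule could anneal exploration over training.
        eps_in = self._f(self.rng.standard_normal(self.w_mu.shape[1]))
        eps_out = self._f(self.rng.standard_normal(self.w_mu.shape[0]))
        w = self.w_mu + noise_scale * self.w_sigma * np.outer(eps_out, eps_in)
        b = self.b_mu + noise_scale * self.b_sigma * eps_out
        y = x @ w.T + b
        if x.shape[-1] == y.shape[-1]:
            y = y + x  # residual connection when shapes match
        return y

def soft_update(target_params, online_params, tau=0.005):
    """Soft (Polyak) target-network update: theta' <- tau*theta + (1-tau)*theta'."""
    for k in target_params:
        target_params[k] = tau * online_params[k] + (1 - tau) * target_params[k]
    return target_params
```

With `noise_scale=0.0` the layer reduces to a deterministic linear layer, which is one simple way a scheduler could shut off exploration late in training.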