Deep reinforcement learning (DRL) is playing an increasingly important role in real-world applications. However, obtaining an optimally performing DRL agent for complex tasks, especially those with sparse rewards, remains a significant challenge. Training a DRL agent often stalls at a bottleneck and makes no further progress. In this paper, we propose RICE, an innovative refining scheme for reinforcement learning that incorporates explanation methods to break through such training bottlenecks. The high-level idea of RICE is to construct a new initial state distribution that combines the default initial states with critical states identified through explanation methods, thereby encouraging the agent to explore from the mixed initial states. Through careful design, we theoretically guarantee that our refining scheme has a tighter sub-optimality bound. We evaluate RICE in various popular RL environments and real-world applications. The results demonstrate that RICE significantly outperforms existing refining schemes in enhancing agent performance.
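The mixed initial state distribution described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the two sources of states are weighted or how critical states are chosen, so the function name `sample_initial_state`, the mixing probability `mix_prob`, and the uniform choice among `critical_states` are all assumptions made for illustration. The critical states are taken as given here; identifying them with an explanation method is outside the scope of the sketch.

```python
import random
from typing import Callable, Optional, Sequence, TypeVar

State = TypeVar("State")


def sample_initial_state(
    default_reset: Callable[[], State],
    critical_states: Sequence[State],
    mix_prob: float,
    rng: Optional[random.Random] = None,
) -> State:
    """Draw a start state from a mixture of default and critical states.

    All names and the uniform choice below are hypothetical, not from the paper.
    """
    if not 0.0 <= mix_prob <= 1.0:
        raise ValueError("mix_prob must lie in [0, 1]")
    if rng is None:
        rng = random.Random()
    # With probability mix_prob, restart from a critical state (assumed to be
    # picked uniformly at random), provided at least one has been identified.
    if critical_states and rng.random() < mix_prob:
        return rng.choice(critical_states)
    # Otherwise fall back to the environment's default initial state.
    return default_reset()
```

In a refining loop, a function like this would replace the environment's plain reset at the start of each episode, so that some episodes begin from the default start and others from a previously identified critical state. Setting `mix_prob` to 0 recovers ordinary training from the default initial states.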