The operation of future sixth-generation (6G) mobile networks will increasingly rely on the ability of deep reinforcement learning (DRL) to optimize network decisions in real time. DRL has demonstrated efficacy in various resource allocation problems, such as joint decisions on user scheduling and antenna allocation, or simultaneous control of computing resources and modulation. However, trained DRL agents are closed boxes that are inherently difficult to explain, which hinders their adoption in production settings. In this paper, we take a step towards removing this critical barrier by presenting SymbXRL, a novel technique for explainable reinforcement learning (XRL) that synthesizes human-interpretable explanations for DRL agents. SymbXRL leverages symbolic AI to produce explanations in which key concepts and their relationships are described via intuitive symbols and rules; coupling such a representation with logical reasoning exposes the decision process of DRL agents and yields more comprehensible descriptions of their behavior than existing approaches. We validate SymbXRL in practical network management use cases supported by DRL, showing that it not only improves the semantics of the explanations but also paves the way for explicit agent control: for instance, it enables intent-based programmatic action steering that improves the median cumulative reward by 12% over a pure DRL solution.