Deep reinforcement learning (DRL) promises adaptive control for future mobile networks, but conventional agents remain reactive: they act on past and current measurements and cannot leverage short-term forecasts of exogenous KPIs such as bandwidth. Augmenting agents with predictions can overcome this temporal myopia, yet uptake in networking remains limited because forecast-aware agents behave as closed boxes; operators cannot tell whether predictions actually guide decisions or whether they justify the added complexity. We propose SIA, the first interpreter that exposes in real time how forecast-augmented DRL agents operate. SIA fuses Symbolic AI abstractions with per-KPI Knowledge Graphs to produce explanations and introduces a new Influence Score metric. SIA runs in under a millisecond, over 200x faster than existing XAI methods. We evaluate SIA on three diverse networking use cases, uncovering hidden issues such as temporal misalignment in forecast integration and reward-design biases that trigger counterproductive policies. These insights enable targeted fixes: a redesigned agent achieves a 9% higher average bitrate in video streaming, and SIA's online Action-Refinement module improves RAN-slicing reward by 25% without retraining. By making anticipatory DRL transparent and tunable, SIA lowers the barrier to proactive control in next-generation mobile networks.