Sixth-generation (6G) radio access networks (RANs) must enforce strict service-level agreements (SLAs) for heterogeneous slices, yet sudden latency spikes remain difficult to diagnose and resolve with conventional deep reinforcement learning (DRL) or explainable RL (XRL). We propose \emph{Attention-Enhanced Multi-Agent Proximal Policy Optimization (AE-MAPPO)}, which integrates six specialized attention mechanisms into multi-agent slice control and surfaces them as zero-cost, faithful explanations. The framework operates across O-RAN timescales with a three-phase strategy: predictive, reactive, and inter-slice optimization. A URLLC case study shows AE-MAPPO resolves a latency spike in $18$ms, restores latency to $0.98$ms with $99.9999\%$ reliability, and reduces troubleshooting time by $93\%$ while maintaining eMBB and mMTC continuity. These results confirm AE-MAPPO's ability to combine SLA compliance with inherent interpretability, enabling trustworthy and real-time automation for 6G RAN slicing.