Proactive caching is essential for minimizing latency and improving Quality of Experience (QoE) in multi-server edge networks. Federated Deep Reinforcement Learning (FDRL) is a promising approach for developing caching policies tailored to dynamic content requests. However, FDRL faces challenges such as a caching action space that expands with the number of content items, and the difficulty of adapting global information to heterogeneous edge environments. In this paper, we propose a Personalized Federated Deep Reinforcement Learning framework for Caching, called PF-DRL-Ca, which aims to maximize system utility while satisfying caching capacity constraints. To manage the expanding action space, we employ a new DRL algorithm, Multi-head Deep Q-Network (MH-DQN), which reshapes the action output layer of DQN into a multi-head structure in which each head generates one sub-dimensional action. We then integrate the proposed MH-DQN into a personalized federated training framework that adopts a layer-wise training approach, yielding a personalized model that adapts to heterogeneous environments while exploiting global information to accelerate learning convergence. Extensive experimental results demonstrate the superiority of MH-DQN over traditional DRL algorithms on a single server, as well as the advantages of the personalized federated training architecture over other frameworks.
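To make the multi-head idea concrete, the following PyTorch sketch illustrates one plausible reading of the MH-DQN output structure described above; it is not the authors' implementation, and all names and sizes (MultiHeadDQN, state_dim, num_contents, num_slots) are our own assumptions. Rather than a single output layer enumerating every joint caching action (num_contents ** num_slots entries), each head scores the candidate contents for one cache slot, so the output size grows linearly as num_slots * num_contents.

```python
import torch
import torch.nn as nn

class MultiHeadDQN(nn.Module):
    """Hypothetical sketch of a multi-head Q-network (MH-DQN).

    One linear head per cache slot replaces a single output layer over
    the combinatorial joint action space: output is num_slots heads of
    num_contents Q-values each, instead of num_contents ** num_slots.
    """

    def __init__(self, state_dim: int, num_contents: int,
                 num_slots: int, hidden: int = 128):
        super().__init__()
        # Shared feature extractor over the (assumed) state encoding,
        # e.g. recent request statistics and current cache occupancy.
        self.trunk = nn.Sequential(
            nn.Linear(state_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
        )
        # Head i outputs Q-values over contents for sub-action i.
        self.heads = nn.ModuleList(
            [nn.Linear(hidden, num_contents) for _ in range(num_slots)]
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        h = self.trunk(state)
        # Stack per-head Q-values: (batch, num_slots, num_contents).
        return torch.stack([head(h) for head in self.heads], dim=1)


# Greedy sub-action selection: each head independently picks the
# content with the highest Q-value for its slot.
net = MultiHeadDQN(state_dim=32, num_contents=100, num_slots=5)
q = net(torch.randn(4, 32))   # shape (4, 5, 100)
action = q.argmax(dim=-1)     # shape (4, 5): one content index per slot
```

Note that independent greedy selection can assign the same content to multiple slots; a practical caching agent would presumably mask already-selected contents or otherwise enforce distinctness, a detail the abstract does not specify.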