The fifth generation (5G) of wireless networks aims to meet the stringent requirements of vehicular use cases. Edge computing resources can help by moving processing closer to end users, which reduces latency. However, because traffic loads and the availability of physical resources are stochastic, appropriate auto-scaling mechanisms are needed to deliver services that are both cost-efficient and performant. To this end, we employ Deep Reinforcement Learning (DRL) for vertical scaling in edge computing to support vehicle-to-network (V2N) communications. We address the problem using Deep Deterministic Policy Gradient (DDPG). Because DDPG is a model-free, off-policy algorithm that learns continuous actions, we introduce a discretization approach that maps its output to discrete scaling actions. This avoids the scalability problems inherent to high-dimensional discrete action spaces. Using a real-world vehicular trace data set, we show that DDPG outperforms existing solutions, reducing the average number of active CPUs by at least 23% while increasing the long-term reward by 24%.
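The abstract does not spell out the discretization scheme, so the following is only a minimal sketch of one common option: uniform binning of a continuous actor output into integer CPU changes. It assumes the actor emits a scalar in [-1, 1]; the function names `discretize_action` and `apply_scaling`, and the parameters `max_step`, `min_cpus`, and `max_cpus`, are hypothetical and not taken from the paper.

```python
def discretize_action(a: float, max_step: int = 2) -> int:
    """Map a continuous action a in [-1, 1] to an integer CPU delta
    in [-max_step, max_step] using equal-width bins (assumed scheme)."""
    a = max(-1.0, min(1.0, a))           # clip out-of-range actor outputs
    n_levels = 2 * max_step + 1          # e.g. {-2, -1, 0, +1, +2}
    # Rescale [-1, 1] to [0, n_levels) and take the bin index;
    # the upper endpoint a == 1.0 is folded into the last bin.
    idx = min(int((a + 1.0) / 2.0 * n_levels), n_levels - 1)
    return idx - max_step


def apply_scaling(cpus: int, delta: int, min_cpus: int = 1, max_cpus: int = 8) -> int:
    """Apply a vertical scaling step, keeping the CPU count within
    hypothetical capacity limits [min_cpus, max_cpus]."""
    return max(min_cpus, min(max_cpus, cpus + delta))


if __name__ == "__main__":
    cpus = 4
    for a in (-1.0, -0.5, 0.0, 0.5, 1.0):
        delta = discretize_action(a)
        print(f"action={a:+.1f} -> delta={delta:+d} -> cpus={apply_scaling(cpus, delta)}")
```

With this kind of mapping the actor keeps a one-dimensional continuous output regardless of how many scaling levels exist, which is the property that sidesteps enumerating a large discrete action set.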