The fifth generation (5G) of wireless networks aims to meet the stringent requirements of vehicular use cases. Edge computing resources can help by moving processing closer to end users, thereby reducing latency. However, because traffic loads and the availability of physical resources are stochastic, appropriate auto-scaling mechanisms are needed to support cost-efficient and performant services. To this end, we employ Deep Reinforcement Learning (DRL) for vertical scaling in edge computing to support vehicle-to-network (V2N) communications. We address the problem using Deep Deterministic Policy Gradient (DDPG). Because DDPG is a model-free, off-policy algorithm for learning continuous actions, we introduce a discretization approach to support discrete scaling actions. In this way, we address the scalability problems inherent to high-dimensional discrete action spaces. Using a real-world vehicular trace data set, we show that DDPG outperforms existing solutions, reducing the average number of active CPUs by at least 23% while increasing the long-term reward by 24%.
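To make the idea of discretizing a continuous action concrete, the following is a minimal sketch and not the paper's actual method, which the abstract does not specify. It assumes the DDPG actor emits a scalar in [-1, 1] and that a scaling action is an integer change in the number of active CPUs. The names `discretize_action` and `apply_scaling`, and the parameters `max_step`, `min_cpus`, and `max_cpus`, are hypothetical and chosen only for illustration.

```python
def discretize_action(a: float, max_step: int = 2) -> int:
    """Map a continuous actor output to an integer CPU change.

    Assumption: the actor output lies in [-1, 1]; values outside
    that range are clipped. The result lies in [-max_step, max_step].
    """
    a = max(-1.0, min(1.0, float(a)))  # clip to the assumed action range
    return round(a * max_step)         # nearest integer scaling step


def apply_scaling(active_cpus: int, a: float, max_step: int = 2,
                  min_cpus: int = 1, max_cpus: int = 8) -> int:
    """Apply the discretized action and keep the CPU count within bounds.

    Assumption: the edge host exposes between min_cpus and max_cpus CPUs.
    """
    delta = discretize_action(a, max_step)
    return max(min_cpus, min(max_cpus, active_cpus + delta))


if __name__ == "__main__":
    # Example: a strongly positive actor output scales up by two CPUs.
    print(apply_scaling(active_cpus=4, a=0.9))
```

With this kind of mapping the actor network keeps a single continuous output regardless of how many discrete scaling levels exist, which is one plausible way a continuous-action method can avoid enumerating a large discrete action set.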