Services composed of distributed, interdependent components are becoming a popular way to harness the dispersed resources available on cloud and edge networks. However, deploying and managing these services effectively, a task known as service coordination, is challenging. Service coordination comprises placing and scaling components and scheduling incoming traffic that requests services across the deployed instances. Because the problem is online in nature and Deep Reinforcement Learning (DRL) methods have proven successful, previous works have applied DRL agents to service coordination, yet these solutions must be retrained for every unseen scenario. Other works have tried to address this shortcoming by incorporating Graph Neural Networks (GNNs) into their solutions, but they often focus on specific aspects of the problem while disregarding others, or cannot operate in dynamic, practical settings where no labeled dataset exists and feedback from the network may be delayed. In response to these challenges, we present GSC, a generalizable service coordinator that jointly considers service placement, scaling, and traffic scheduling. GSC can operate in unseen situations without significant performance degradation and outperforms existing state-of-the-art solutions by 40% in simulations of real-world network situations.