Network Function Virtualization (NFV) seeks to replace hardware middleboxes with software-based Network Functions (NFs). NFV systems are increasingly deployed in the cloud and at the edge. However, especially at the edge, the traditional focus on NFV throughput is mismatched with the need to meet very low latency SLOs, which edge services inherently require. Moreover, cloud-based NFV systems must achieve such low latency while minimizing CPU core usage. We find that real-world traffic exhibits burstiness that causes latency spikes of up to tens of milliseconds in existing NFV systems. To address this, we built NetBlaze, which achieves sub-millisecond p99 latency SLOs, even for adversarial traffic, using a novel multi-scale core-scaling strategy. NetBlaze makes traffic-to-core allocation decisions at three spatial scales (rack, server, and core), and at increasingly finer timescales, to accommodate multi-timescale bursts. Compared with state-of-the-art approaches, NetBlaze is the only one that achieves sub-millisecond p99 latency SLOs while using a comparable number of cores.