Shared software datapaths underpin modern datacentre networking. They implement mechanisms such as virtual switching, network virtualisation tunnelling, or reliable transport, and enforce policies such as tenant rate limits, virtual network isolation, or congestion control. However, because multiple applications, containers, or VMs share them, often across tenants, they pose a tail latency isolation challenge. Current isolation approaches either sacrifice efficiency via coarse-grained core partitioning or provide weak tail latency isolation when sharing cores with basic rate limits. This paper presents Virtuoso, a time protection mechanism for shared software datapaths that provides strong cross-tenant tail latency isolation while preserving low overhead and microsecond-scale latency. Our key insight is that tail latency is fundamentally a time metric, so byte or packet throughput is the wrong metric for controlling interference when packet processing costs vary. Our design instead enforces isolation through per-tenant CPU-time budgets at datapath intervention points within run-to-completion loops, without relying on preemption. In a case study, we instantiate Virtuoso in the TAS TCP stack and demonstrate a 7.8× reduction in victim tail latency under adversarial interference while keeping throughput within 5% of unmodified TAS. We also observe a 3× per-core efficiency improvement compared to siloed datapaths under bursty workloads.
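To make the idea of per-tenant CPU-time budgets concrete, the following is a minimal illustrative sketch, not the Virtuoso implementation. It assumes a simple round-based replenishment policy, and all names (`Tenant`, `run_rounds`, `budget_per_round_us`) are hypothetical. Packet costs are simulated numbers; a real datapath would presumably measure them with a cycle counter around each processing step. The budget is checked only at an intervention point between packets, so a packet that has started always runs to completion and any overrun is repaid from later rounds.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Tenant:
    name: str
    budget_per_round_us: float  # CPU time granted per round (assumed policy)
    queue: deque = field(default_factory=deque)  # simulated per-packet costs, in microseconds
    balance_us: float = 0.0  # remaining CPU-time budget; may go negative after an overrun
    used_us: float = 0.0  # total CPU time consumed so far


def run_rounds(tenants, rounds):
    """Cooperative run-to-completion loop with per-tenant time budgets."""
    for _ in range(rounds):
        for t in tenants:
            # Replenish, capped at one round's budget so an idle tenant
            # cannot hoard time and burst later.
            t.balance_us = min(t.balance_us + t.budget_per_round_us,
                               t.budget_per_round_us)
            # Intervention point: the budget is checked before each packet,
            # never in the middle of one (no preemption).
            while t.queue and t.balance_us > 0:
                cost = t.queue.popleft()  # stands in for measured processing time
                t.balance_us -= cost      # an overrun leaves a debt for later rounds
                t.used_us += cost


# Hypothetical workload: a victim with cheap packets, an attacker with costly ones.
victim = Tenant("victim", 10.0, deque([1.0] * 100))
attacker = Tenant("attacker", 10.0, deque([8.0] * 100))
run_rounds([victim, attacker], rounds=10)
print(victim.used_us, attacker.used_us)
```

Under this policy, a tenant's CPU time over any number of rounds is bounded by the granted budget plus the cost of at most one packet, regardless of how many packets it queues. A packet-count or byte-count limit gives no such bound when per-packet costs differ between tenants.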