Memory dominates datacenter system cost and power. Memory expansion via Compute Express Link (CXL) is an effective way to provide additional memory at lower cost and power, but using it well for hyperscaler workloads requires software-level tiering. Existing tiering solutions, including current Linux support, face fundamental limitations in production deployments. First, they lack multi-tenancy support and fail to handle stacked homogeneous or heterogeneous workloads. Second, their limited control-plane flexibility leads to fairness violations and performance variability. Finally, their insufficient observability prevents operators from diagnosing performance pathologies at scale. We present Equilibria, an OS framework enabling fair, multi-tenant CXL tiering at datacenter scale. Equilibria provides per-container controls for memory fair-share allocation and fine-grained observability of tiered-memory usage and operations. It further enforces flexible, user-specified fairness policies through regulated promotion and demotion, and mitigates noisy-neighbor interference by suppressing thrashing. Evaluated in a large hyperscaler fleet using production workloads and benchmarks, Equilibria helps workloads meet service level objectives (SLOs) while avoiding performance interference. It improves performance over the state-of-the-art Linux solution, TPP, by up to 52% for production workloads and 1.7x for benchmarks. All Equilibria patches have been released to the Linux community.