Memory dominates datacenter system cost and power. Memory expansion via Compute Express Link (CXL) can provide additional memory at lower cost and power, but using it well for hyperscaler workloads requires software-level tiering. Existing tiering solutions, including current Linux support, face fundamental limitations in production deployments. First, they lack multi-tenancy support and fail to handle stacked homogeneous or heterogeneous workloads. Second, their limited control-plane flexibility leads to fairness violations and performance variability. Third, insufficient observability prevents operators from diagnosing performance pathologies at scale. We present Equilibria, an OS framework that enables fair, multi-tenant CXL tiering at datacenter scale. Equilibria provides per-container controls for memory fair-share allocation and fine-grained observability of tiered-memory usage and operations. It also enforces flexible, user-specified fairness policies through regulated promotion and demotion, and it mitigates noisy-neighbor interference by suppressing thrashing. Evaluated in a large hyperscaler fleet using production workloads and benchmarks, Equilibria helps workloads meet service level objectives (SLOs) while avoiding performance interference. It improves performance over the state-of-the-art Linux solution, TPP, by up to 52% for production workloads and 1.7x for benchmarks. All Equilibria patches have been released to the Linux community.