Recent serverless workloads tend to be large-scale and CPU- and memory-intensive, such as deep learning (DL) and graph applications, and they require dynamic provisioning of memory and compute resources. Meanwhile, recent solutions design page management strategies for multi-tiered memory systems so that heavy workloads run efficiently. Compute Express Link (CXL) is an ideal platform for serverless runtimes: its cache coherence and large memory capacity offer a holistic memory namespace. However, naively offloading serverless applications to CXL incurs substantial latency. In this work, we first quantify the impact of CXL on various serverless applications. Second, we argue that there is an opportunity to provision DRAM and CXL memory to serverless workloads in a fine-grained, application-specific manner, by creating a shim layer that identifies hot regions and naively places them in DRAM while leaving cold and warm regions in CXL memory. Based on these observations, we finally propose a prototype of Porter, a middleware that sits between a modern serverless architecture and a CXL-enabled tiered memory system, to use memory resources efficiently while reducing cost.
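To make the hot/cold placement idea concrete, the following is a minimal sketch of a naive placement policy, not Porter's actual implementation. It assumes that a profiler has already produced a sampled access count per memory region; the `Region` type, the `place_regions` function, and the access-density ranking are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """A memory region of one serverless function (hypothetical model)."""
    name: str
    size_mb: int   # region size in MiB, assumed to be > 0
    accesses: int  # sampled access count from an assumed profiler


def place_regions(regions, dram_budget_mb):
    """Naively assign each region to "DRAM" or "CXL".

    Regions are ranked by access density (accesses per MiB). The hottest
    regions are placed in DRAM while they still fit in the budget; every
    other region is left in CXL memory.
    """
    placement = {}
    remaining = dram_budget_mb
    ranked = sorted(regions, key=lambda r: r.accesses / r.size_mb, reverse=True)
    for region in ranked:
        if region.size_mb <= remaining:
            placement[region.name] = "DRAM"
            remaining -= region.size_mb
        else:
            placement[region.name] = "CXL"
    return placement


if __name__ == "__main__":
    # Illustrative numbers only; they are not measurements.
    regions = [
        Region("weights", 512, 90_000),
        Region("activations", 128, 80_000),
        Region("graph_edges", 2048, 4_000),
    ]
    print(place_regions(regions, dram_budget_mb=640))
```

The greedy ranking is only one possible policy. A real shim layer would also need to decide how hotness is measured and when regions are migrated, which this sketch leaves out.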