Wave: A Split OS Architecture for Application Engines

The end of Moore's Law and the tightening performance requirements in today's clouds make re-architecting the software stack a necessity. To address this, cloud providers and vendors offload the virtualization control plane and data plane, along with the host OS data plane, to IPUs (SmartNICs), recovering scarce host resources that applications can then use. However, the host OS control plane (kernel thread scheduling, memory management, the network stack, file systems, and more) is left on the host CPU, where it degrades workload performance. This paper presents Wave, a split OS architecture that moves OS subsystem policies to the IPU while keeping OS mechanisms on the host CPU. Wave not only frees host CPU resources but also reduces interference with host workloads and leverages network insights available on the IPU to improve policy decisions. Wave makes OS control plane offloading practical despite high host-IPU communication latency, the lack of a coherent interconnect, and operation across two system images. We present Wave's design and implementation, and implement several OS subsystems in Wave, including kernel thread scheduling, the control plane for a network stack, and memory management. We then evaluate the Wave subsystems on Stubby (scheduling and network), our GCE VM service (scheduling), and RocksDB (memory management and scheduling). We demonstrate that Wave subsystems are competitive with, and often superior to, on-host subsystems, saving 8 host CPUs for Stubby and 16 host CPUs for database memory management, and improving VM performance by up to 11.2%.
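
The policy/mechanism split described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not Wave's actual interface: the message types `Event` and `Decision`, the classes `FifoPolicy` and `HostMechanism`, and the rule that the host rejects a decision that went stale while in flight are all hypothetical names and behaviors chosen for illustration. The abstract itself says only that policies run on the IPU, mechanisms stay on the host, and the link between them has high latency.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Event:
    """Host -> IPU message (hypothetical format)."""
    kind: str  # "runnable" or "blocked"
    tid: int


@dataclass
class Decision:
    """IPU -> host message (hypothetical format)."""
    cpu: int
    tid: int


class FifoPolicy:
    """Policy half: in a split design this would run on the IPU.

    It learns about threads only through messages from the host.
    """

    def __init__(self):
        self.runnable = deque()

    def on_event(self, ev):
        if ev.kind == "runnable":
            self.runnable.append(ev.tid)
        elif ev.kind == "blocked" and ev.tid in self.runnable:
            self.runnable.remove(ev.tid)

    def decide(self, idle_cpus):
        # Assign the oldest runnable threads to the idle CPUs, in order.
        decisions = []
        for cpu in idle_cpus:
            if not self.runnable:
                break
            decisions.append(Decision(cpu, self.runnable.popleft()))
        return decisions


class HostMechanism:
    """Mechanism half: stays on the host CPU and applies decisions."""

    def __init__(self, num_cpus):
        self.running = {cpu: None for cpu in range(num_cpus)}
        self.alive = set()

    def idle_cpus(self):
        return [cpu for cpu, tid in self.running.items() if tid is None]

    def commit(self, d):
        # Assumption: a decision may be stale by the time it arrives,
        # so the host validates it against current state before applying.
        if d.tid not in self.alive or self.running.get(d.cpu) is not None:
            return False
        self.running[d.cpu] = d.tid
        return True


def simulate():
    host = HostMechanism(num_cpus=2)
    policy = FifoPolicy()
    to_ipu, to_host = deque(), deque()  # stand-ins for host-IPU queues

    for tid in (1, 2, 3):
        host.alive.add(tid)
        to_ipu.append(Event("runnable", tid))

    while to_ipu:
        policy.on_event(to_ipu.popleft())
    to_host.extend(policy.decide(host.idle_cpus()))

    # Thread 2 exits on the host while its decision is still in flight.
    host.alive.discard(2)

    results = [host.commit(to_host.popleft()) for _ in range(len(to_host))]
    return host.running, results
```

In this sketch the policy never touches host state directly; it sees events and emits decisions, while the host keeps the final say over whether a decision is applied. That validation step is one plausible way to tolerate communication latency, and it is an assumption of this example rather than something the abstract specifies.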

