Automatic Parallelization of Software Network Functions

Software network functions (NFs) offer flexibility and ease of deployment, but at the cost of making high performance harder to achieve. The traditional way to increase NF performance is to distribute traffic across multiple CPU cores, but this poses a significant challenge: how can an NF be parallelized without breaking its semantics? We propose Maestro, a tool that analyzes a sequential implementation of an NF and automatically generates an enhanced parallel version that carefully configures the NIC's Receive Side Scaling (RSS) mechanism to distribute traffic across cores while preserving semantics. When possible, Maestro orchestrates a shared-nothing architecture, in which each core operates independently without shared-memory coordination, maximizing performance. Otherwise, Maestro choreographs a fine-grained read-write locking mechanism that optimizes operation for typical Internet traffic. We parallelized 8 software NFs and show that they generally scale up linearly until bottlenecked by PCIe when using small packets, or by the 100 Gbps line rate with typical Internet traffic. Maestro further outperforms modern hardware-based transactional memory mechanisms, even for challenging parallel-unfriendly workloads.
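To make the RSS-based traffic distribution concrete, here is a minimal sketch of how a NIC-style Toeplitz hash can steer both directions of a flow to the same core, which is the kind of property a shared-nothing design relies on. It is an illustration under stated assumptions, not Maestro's generated configuration: the key (the 16-bit pattern 0x6d5a repeated, a commonly cited symmetric RSS key), the four-core setup, the 128-entry indirection table, and all function names are hypothetical choices made for this example.

```python
import ipaddress
import struct

# Assumption: a 40-byte key made of the 16-bit pattern 0x6d5a repeated.
# Because the key repeats every 16 bits, swapping the 32-bit addresses and
# the 16-bit ports of a packet leaves the hash unchanged.
SYMMETRIC_KEY = bytes.fromhex("6d5a") * 20

NUM_CORES = 4  # hypothetical core count
# Hypothetical indirection table mapping hash buckets to cores round-robin.
INDIRECTION_TABLE = [i % NUM_CORES for i in range(128)]


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Toeplitz hash: for every set input bit (MSB first), XOR in the
    32-bit window of the key that starts at that bit's position."""
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    if key_bits < len(data) * 8 + 31:
        raise ValueError("key too short for this input")
    result = 0
    for i, byte in enumerate(data):
        for b in range(8):
            if byte & (0x80 >> b):
                shift = key_bits - 32 - (i * 8 + b)
                result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


def flow_tuple(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> bytes:
    """Pack the IPv4 4-tuple in network byte order: addresses, then ports."""
    return (
        ipaddress.IPv4Address(src_ip).packed
        + ipaddress.IPv4Address(dst_ip).packed
        + struct.pack("!HH", src_port, dst_port)
    )


def core_for_packet(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> int:
    """Pick a core: hash the 4-tuple, then index the indirection table
    with the low-order bits of the hash."""
    h = toeplitz_hash(SYMMETRIC_KEY, flow_tuple(src_ip, dst_ip, src_port, dst_port))
    return INDIRECTION_TABLE[h % len(INDIRECTION_TABLE)]


if __name__ == "__main__":
    forward = core_for_packet("10.0.0.1", "192.168.1.7", 12345, 80)
    reverse = core_for_packet("192.168.1.7", "10.0.0.1", 80, 12345)
    print(f"forward -> core {forward}, reverse -> core {reverse}")
```

With a key like this, a stateful NF that keeps per-connection state can hold that state on a single core, since packets of a connection arrive at the same core in both directions. An NF whose state is keyed differently would need a different choice of hashed fields or key, which is the kind of decision the abstract attributes to Maestro's analysis.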

