The programmability of modern network devices has led to innovative research in the area of in-network computing, i.e., offloading certain computations to the programmable data plane. Key-value stores, which offer coordination services for many large-scale data centres, have benefited from this technological advancement. Previous research halved the response latency of key-value requests by deploying the store in the programmable data plane. In this work, we identify design decisions in that earlier work that increase traffic generation and latency for in-network coordination services. We have developed a new in-network key-value store platform that maintains strong consistency and fault tolerance while improving performance and scalability over the state of the art. We have designed and implemented the platform in P4, and analysed the optimisations that unlock these performance improvements. Our evaluation shows that latency is reduced by up to orders of magnitude and that throughput improves significantly. In scenarios with multiple participating nodes, we obtain up to nine times higher throughput, which indicates the superior scalability the platform can offer.