RACS and SADL: Towards Robust SMR in the Wide-Area Network

Widely deployed consensus protocols in the cloud are often leader-based and optimized for low latency under synchronous network conditions. However, cloud networks can experience disruptions such as network partitions, high-loss links, and configuration errors. These disruptions interfere with the operation of leader-based protocols: their view change mechanisms interrupt normal-case replication and cause the system to stall. This paper proposes RACS, a novel randomized consensus protocol that ensures robustness against adversarial network conditions. RACS achieves optimal one-round-trip latency under synchronous network conditions while remaining resilient to adversarial network conditions. RACS follows a simple design inspired by Raft, the most widely used consensus protocol in the cloud, and therefore enables seamless integration with the existing cloud software stack, a goal no previous asynchronous protocol has successfully achieved. Experiments with a prototype deployed on Amazon EC2 confirm that RACS achieves a throughput of 28k cmd/sec under adversarial cloud network conditions, whereas existing leader-based protocols such as Multi-Paxos and Raft provide less than 2.8k cmd/sec. Under synchronous network conditions, RACS matches the performance of Multi-Paxos and Raft, achieving a throughput of 200k cmd/sec with a latency of 300 ms, confirming that RACS introduces no unnecessary overhead. Finally, SADL-RACS, an optimized version of RACS designed for high performance and robustness, achieves a throughput of 500k cmd/sec under synchronous network conditions and 196k cmd/sec under adversarial network conditions, further enhancing both performance and robustness.
