Set reconciliation, where two parties hold fixed-length bit strings and run a protocol to learn the strings they are missing from each other, is a fundamental task in many distributed systems. We present Rateless Invertible Bloom Lookup Tables (Rateless IBLT), the first set reconciliation protocol, to the best of our knowledge, that achieves low computation cost and near-optimal communication cost across a wide range of scenarios: set differences of one to millions, bit strings of a few bytes to megabytes, and workloads injected by potential adversaries. Rateless IBLT is based on a novel encoder that incrementally encodes the set difference into an infinite stream of coded symbols, resembling rateless error-correcting codes. We compare Rateless IBLT with state-of-the-art set reconciliation schemes and demonstrate significant improvements. Rateless IBLT achieves 3–4x lower communication cost than non-rateless schemes with similar computation cost, and 2–2000x lower computation cost than schemes with similar communication cost. We show the real-world benefits of Rateless IBLT by applying it to synchronize the state of the Ethereum blockchain, and demonstrate 5.6x lower end-to-end completion time and 4.4x lower communication cost compared to the system used in production.
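To make the idea of an open-ended stream of coded symbols concrete, here is a minimal Python sketch of an IBLT-style reconciliation loop. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the encoder, so the mapping rule (an item joins coded symbol `i` with probability `1 / (1 + i/2)`), the 64-bit item size, the SHA-256-based checksum, and all function names (`maps_to`, `coded_symbol`, `reconcile`) are hypothetical choices made for this example. The sender keeps producing symbols until the receiver's peeling decoder has recovered the whole difference.

```python
import hashlib

def _h(data: bytes) -> int:
    """64-bit hash derived from SHA-256 (illustrative choice)."""
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

def checksum(item: int) -> int:
    return _h(b"sum" + item.to_bytes(8, "big"))

def maps_to(item: int, i: int) -> bool:
    """Assumed mapping rule: item joins symbol i with probability 1/(1 + i/2).
    Symbol 0 therefore contains every item."""
    r = _h(b"map" + item.to_bytes(8, "big") + i.to_bytes(8, "big")) / 2**64
    return r < 1.0 / (1.0 + 0.5 * i)

def coded_symbol(items, i):
    """Coded symbol i of a set: [XOR of items, XOR of checksums, item count]."""
    s = c = n = 0
    for x in items:
        if maps_to(x, i):
            s ^= x
            c ^= checksum(x)
            n += 1
    return [s, c, n]

def reconcile(alice, bob, max_symbols=10_000):
    """Stream symbols until the difference is fully decoded.
    Returns (items only Alice has, items only Bob has, symbols used)."""
    diff = []                      # difference symbols received so far
    only_a, only_b = set(), set()
    for i in range(max_symbols):
        a = coded_symbol(alice, i)  # sent by Alice
        b = coded_symbol(bob, i)    # computed locally by Bob
        sym = [a[0] ^ b[0], a[1] ^ b[1], a[2] - b[2]]
        # Remove items that were already recovered from the new symbol.
        for x in only_a | only_b:
            if maps_to(x, i):
                sym[0] ^= x
                sym[1] ^= checksum(x)
                sym[2] -= (1 if x in only_a else -1)
        diff.append(sym)
        # Peel: a symbol holding exactly one item reveals that item.
        progress = True
        while progress:
            progress = False
            for cur in diff:
                s, c, n = cur
                if n in (1, -1) and checksum(s) == c:
                    (only_a if n == 1 else only_b).add(s)
                    for k in range(len(diff)):
                        if maps_to(s, k):
                            diff[k][0] ^= s
                            diff[k][1] ^= checksum(s)
                            diff[k][2] -= n
                    progress = True
        # Symbol 0 contains every item, so it is empty once all are recovered.
        if diff[0] == [0, 0, 0]:
            return only_a, only_b, i + 1
    raise RuntimeError("difference not decoded within max_symbols")

if __name__ == "__main__":
    only_a, only_b, used = reconcile(set(range(1000)), set(range(5, 1003)))
    print(sorted(only_a), sorted(only_b), used)
```

Neither party has to estimate the size of the difference in advance: the loop simply stops at the first prefix of the stream that decodes. For clarity the sketch re-encodes each symbol from scratch and tests every item against every symbol index, which is far less efficient than a practical encoder would be.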