Replicated append-only logs sequentially order messages from the same author so that the ordering can eventually be recovered even when individual messages are disseminated out of order and unreliably. They are widely used to implement replicated services in both cloud and peer-to-peer environments because they provide simple and efficient incremental reconciliation. However, existing designs assume that replicas faithfully maintain the sequential properties of logs, and they do not provide eventual consistency when malicious participants fork their logs by disseminating different messages to different replicas for the same index, which may partition replicas according to which branch each replicated first. In this paper, we present 2P-BFT-Log, a two-phase replicated append-only log that provides eventual consistency in the presence of forks from malicious participants: all correct replicas eventually agree either on the most recent message of a valid log (first phase) or on the earliest point at which a fork occurred, together with an irrefutable proof that it happened (second phase). We provide definitions, algorithms, and proofs of the key properties of the design, and explain one way to implement the design on top of Git, an eventually consistent replicated database originally designed for distributed version control. Our design enables correct replicas to faithfully implement the happens-before relationship first introduced by Lamport, which underpins most existing distributed algorithms, with eventual detection of forks from malicious participants so that they can be excluded from further progress. This opens the door to adapting existing distributed algorithms to a cheaper detect-and-repair paradigm, rather than the more common and expensive systematic prevention of incorrect behaviour.
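To make the notion of a fork and its proof concrete, the following is a minimal illustrative sketch, not the algorithm of the paper. It assumes a hash-chained log in which each message names its author, its index, and the hash of its predecessor, and it treats two distinct messages from the same author with the same index and the same predecessor as a fork proof. The names `Message`, `Replica`, `receive`, and `fork_proofs` are hypothetical. Signatures, which an irrefutable proof would need in practice, and out-of-order delivery are omitted for brevity.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Message:
    """One log entry (hypothetical layout, signatures omitted)."""
    author: str
    index: int
    prev: Optional[str]  # digest of the previous message, None for index 0
    payload: str

    def digest(self) -> str:
        data = f"{self.author}|{self.index}|{self.prev}|{self.payload}"
        return hashlib.sha256(data.encode()).hexdigest()


class Replica:
    """Keeps one log per author and the earliest known fork proof per author."""

    def __init__(self):
        self.logs = {}         # author -> list of Message, in index order
        self.fork_proofs = {}  # author -> (Message, Message)

    def receive(self, msg: Message) -> str:
        log = self.logs.setdefault(msg.author, [])
        if msg.index < len(log):
            stored = log[msg.index]
            if stored.digest() == msg.digest():
                return "duplicate"
            if stored.prev == msg.prev:
                # Same author, same index, same predecessor, different content:
                # the pair is kept as evidence of the fork.
                self._record_fork(stored, msg)
                return "fork"
            return "rejected"
        if msg.index == len(log):
            expected_prev = log[-1].digest() if log else None
            if msg.prev == expected_prev:
                log.append(msg)
                return "appended"
        # Assumption: messages that do not extend the known log are dropped
        # in this sketch rather than buffered.
        return "rejected"

    def _record_fork(self, a: Message, b: Message) -> None:
        current = self.fork_proofs.get(a.author)
        # Keep only the proof with the lowest index (the earliest fork seen).
        if current is None or a.index < current[0].index:
            self.fork_proofs[a.author] = (a, b)


# Usage: an author sends two different messages for index 1.
m0 = Message("alice", 0, None, "hello")
m1 = Message("alice", 1, m0.digest(), "branch A")
m1_forked = Message("alice", 1, m0.digest(), "branch B")

replica = Replica()
results = [replica.receive(m) for m in (m0, m1, m1, m1_forked)]
```

In this sketch the replica keeps the branch it saw first and stores the conflicting pair as evidence. Exchanging such pairs between replicas and keeping the one with the lowest index is one way correct replicas could converge on the earliest fork point.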