The view synchronization problem lies at the heart of many Byzantine Fault Tolerant (BFT) State Machine Replication (SMR) protocols in the partial synchrony model, since these protocols are usually based on views. Liveness is guaranteed if honest processors spend a sufficiently long time in the same view during periods of synchrony, and if the leader of that view is honest. Ensuring that these conditions hold, a task known as Byzantine View Synchronization (BVS), has turned out to be the performance bottleneck of many BFT SMR protocols. A recent line of work has shown that, by using an appropriate view synchronization protocol, BFT SMR protocols can achieve $O(n^2)$ communication complexity in the worst case after the Global Stabilization Time (GST), thereby finally matching the lower bound established by Dolev and Reischuk in 1985. However, these protocols suffer from two major issues: (1) When implemented so as to be optimistically responsive, even a single Byzantine processor may cause $\Omega(n\Delta)$ latency between consecutive consensus decisions infinitely often. (2) Even in the absence of Byzantine action, infinitely many views require honest processors to send $\Omega(n^2)$ messages. Here, we present Lumiere, an optimistically responsive BVS protocol that maintains optimal worst-case communication complexity while simultaneously addressing the two issues above: for the first time, Lumiere enables BFT consensus solutions in the partial synchrony setting that have $O(n^2)$ worst-case communication complexity, and that eventually always (i.e., except for a small constant number of "warmup" decisions) have communication complexity and latency that are linear in the number of actual faults in the execution.