The view synchronization problem lies at the heart of many Byzantine Fault Tolerant (BFT) State Machine Replication (SMR) protocols in the partial synchrony model, since these protocols are usually based on views. Liveness is guaranteed if honest processors spend a sufficiently long time in the same view during periods of synchrony, and if the leader of that view is honest. The task of ensuring that these conditions hold, known as Byzantine View Synchronization (BVS), has turned out to be the performance bottleneck of many BFT SMR protocols. A recent line of work has shown that, by using an appropriate view synchronization protocol, BFT SMR protocols can achieve $O(n^2)$ communication complexity in the worst case after the global stabilization time (GST), thereby finally matching the lower bound established by Dolev and Reischuk in 1985. However, these protocols suffer from two major issues: (1) When implemented so as to be optimistically responsive, even a single Byzantine processor may infinitely often cause $\Omega(n\Delta)$ latency between consecutive consensus decisions. (2) Even in the absence of Byzantine action, infinitely many views require honest processors to send $\Omega(n^2)$ messages. Here, we present Lumiere, an optimistically responsive BVS protocol that maintains optimal worst-case communication complexity while simultaneously addressing both issues. For the first time, Lumiere enables BFT consensus solutions in the partial synchrony setting that have $O(n^2)$ worst-case communication complexity and that eventually always (i.e., except for a small constant number of "warmup" decisions) have communication complexity and latency that are linear in the number of actual faults in the execution.