Logical clocks are a fundamental tool for establishing the causal ordering of events in a distributed system. They have been applied in weakly consistent storage systems, causally ordered broadcast, distributed snapshots, deadlock detection, and distributed system debugging. However, prior logical clock constructs fail to work in an open network with Byzantine participants. In this work, we present Chrono, a novel logical clock system that targets such challenging environments. We first redefine causality properties among distributed processes under the Byzantine failure model. To enforce these properties, Chrono defines a new validator abstraction for building fault-tolerant logical clocks. Furthermore, the validator abstraction is customizable: Chrono includes multiple backend implementations of the abstraction, each with a different security-performance trade-off. We have applied Chrono to build two decentralized applications, a mutual exclusion service and a weakly consistent key-value store. Chrono adds only marginal overhead compared to systems that tolerate no Byzantine faults. It also outperforms state-of-the-art BFT total order protocols by significant margins.