Distributed computing research has been very successful in charting the solvability/impossibility border of problems such as consensus, across representative classes of computing models, with respect to model parameters such as failure bounds. The same cannot be said for characterizing necessary and sufficient communication requirements. In this paper, we introduce network abstractions as a novel approach for modeling communication requirements in asynchronous distributed systems. A network abstraction of a run is a sequence of directed graphs on the set of processes, where the $i$-th graph specifies some ``potential'' message chains that can be guaranteed to arise in the $i$-th portion of the run. Formally, network abstractions are defined by associating message sending times with the end-to-end delays that would arise if the message were indeed sent by the sender's protocol. Network abstractions also allow reasoning about the future causal cones that might arise in a run, which facilitates reasoning about liveness properties, and they are inherently compatible with temporal epistemic reasoning frameworks. We demonstrate the utility of our approach by providing necessary and sufficient network abstractions for solving the canonical firing rebels with relay (FRR) problem, and variants thereof, in asynchronous message-passing systems with up to $f$ Byzantine processes connected via point-to-point links. FRR is not only a basic primitive in clock synchronization and consensus algorithms, but also integrates several distributed computing problems, namely triggering events, agreement, and even stabilizing agreement, in a single problem instance.
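As a rough illustration of the definition above, the following Python sketch treats a network abstraction as a list of edge sets and computes which processes a given source could potentially influence after each portion of a run. It is not taken from the paper: the function name `future_causal_cone`, the edge-set encoding, and the rule that an edge $(p, q)$ in the $i$-th graph extends the cone by one step are all assumptions made for illustration, and Byzantine behavior is not modeled.

```python
from typing import Hashable, Iterable, List, FrozenSet, Tuple

# Hypothetical encoding: an edge (p, q) in the i-th graph stands for a
# potential message chain from process p to process q in the i-th portion
# of the run.
Edge = Tuple[Hashable, Hashable]


def future_causal_cone(
    graphs: Iterable[Iterable[Edge]], source: Hashable
) -> List[FrozenSet[Hashable]]:
    """Return, for each graph in the sequence, the set of processes that
    `source` could potentially have influenced by the end of that portion.

    Assumption (not from the paper): each edge is followed at most once per
    graph, so the cone grows by one step per portion.
    """
    reached = {source}
    history = []
    for edges in graphs:
        # Collect targets of edges whose origin was already in the cone
        # before this portion started (simultaneous update).
        newly_reached = {q for (p, q) in edges if p in reached}
        reached = reached | newly_reached
        history.append(frozenset(reached))
    return history


# Three processes, three portions of a run.
abstraction = [{("a", "b")}, {("b", "c")}, {("c", "a")}]
cone_from_a = future_causal_cone(abstraction, "a")
cone_from_c = future_causal_cone(abstraction, "c")
```

In this toy run, process `a` reaches `b` in the first portion and `c` in the second, whereas `c` reaches no other process until the third portion. The order of the graphs therefore matters, which is why the abstraction is a sequence rather than a single graph.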