Multiparty session types are designed to abstractly capture the structure of communication protocols and verify behavioural properties. One important such property is progress, i.e., the absence of deadlock. Distributed algorithms often resemble multiparty communication protocols. But proving their properties, in particular termination that is closely related to progress, can be elaborate. Since distributed algorithms are often designed to cope with faults, a first step towards using session types to verify distributed algorithms is to integrate fault-tolerance. We extend FTMPST (a version of fault-tolerant multiparty session types with failure patterns to represent system requirements for system failures such as unreliable communication and process crashes) by a novel, fault-tolerant loop construct with global escapes that does not require global coordination. Each process runs its own local version of the loop. If a process finds a solution to the considered problem, it does not only terminate its own loop but also informs the other participants via exit-messages. Upon receiving an exit-message, a process immediately terminates its algorithm. To increase efficiency and model standard fault-tolerant algorithms, these messages are non-blocking, i.e., a process may continue until a possibly delayed exit-message is received. To illustrate our approach, we analyse a variant of the well-known rotating coordinator algorithm by Chandra and Toueg.
翻译:多方会话类型旨在抽象地捕捉通信协议的结构并验证行为属性。其中一个重要属性是进展性,即无死锁。分布式算法通常类似于多方通信协议,但证明其属性(特别是与进展性密切相关的终止性)可能较为复杂。由于分布式算法常设计用于应对故障,使用会话类型验证分布式算法的第一步是集成容错机制。我们在FTMPST(一种具有故障模式的容错多方会话类型版本,用于表示系统故障如不可靠通信和进程崩溃的系统需求)基础上,引入了一种新颖的容错循环结构,该结构具有全局逃逸功能且无需全局协调。每个进程运行其自身的本地循环版本。若某进程找到所考虑问题的解,它不仅会终止自身循环,还会通过退出消息通知其他参与者。接收到退出消息后,进程立即终止其算法。为提高效率并建模标准容错算法,这些消息采用非阻塞方式,即进程可继续运行直至可能延迟到达的退出消息被接收。为说明本方法,我们分析了Chandra和Toueg提出的著名旋转协调器算法的一个变体。