Provably correct distributed protocols, a critical component of modern distributed systems, are highly challenging to design and have often required decades of human effort. These protocols allow multiple agents to coordinate and reach a common agreement in an environment with uncertainty and failures. We formulate protocol design as a search problem over strategies in a game with imperfect information, with the desired correctness conditions specified in Satisfiability Modulo Theories (SMT). However, standard methods for solving multi-agent games fail to learn correct protocols in this setting, even when the number of agents is small. We propose a learning framework, GGMS, which integrates a specialized variant of Monte Carlo Tree Search with a transformer-based action encoder, a global depth-first search to escape local minima, and repeated feedback from a model checker. Protocols output by GGMS are verified correct via exhaustive model checking over all executions within the bounded setting. We further prove that, under mild assumptions, the search process is complete: if a correct protocol exists, GGMS will eventually find it. In experiments, we show that GGMS can learn correct protocols for larger settings than existing methods.
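The propose-and-verify loop the abstract describes can be sketched at a very high level: a search procedure enumerates candidate protocols, a bounded model checker exhaustively tests every execution, and counterexamples inform the search. The sketch below is purely illustrative and uses a toy two-agent agreement property under message drops; all names, the protocol representation, and the checked property are assumptions, not the paper's actual interfaces.

```python
import itertools

def check(protocol, n_rounds=2):
    """Toy bounded model checker: exhaustively explore every pattern of
    message drops and return a counterexample if the agents can disagree."""
    for drops in itertools.product([False, True], repeat=n_rounds):
        v0, v1 = protocol["init0"], protocol["init1"]
        for r, dropped in enumerate(drops):
            if dropped:
                continue
            if r % 2 == 0:
                v1 = v0  # agent 0's message reaches agent 1
            else:
                v0 = v1  # agent 1's message reaches agent 0
        if v0 != v1:
            return drops  # counterexample execution (which messages dropped)
    return None

def search():
    """Enumerate candidate protocols and return the first one the
    model checker verifies for all bounded executions."""
    for init0, init1 in itertools.product([0, 1], repeat=2):
        candidate = {"init0": init0, "init1": init1}
        if check(candidate) is None:
            return candidate
    return None
```

GGMS replaces the brute-force enumeration here with guided search (MCTS plus a transformer action encoder and a global depth-first search), but the contract is the same: a protocol is only accepted once the model checker finds no violating execution in the bounded setting.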