We present a solution to consensus on a torus with Byzantine faults. Any solution to classic consensus that tolerates $f$ Byzantine faults requires $2f+1$ node-disjoint paths. Because torus connectivity is limited, this bound forces faults to be spatially separated. Our solution does not require this many disjoint paths and tolerates dense faults. Specifically, we consider the case where all faults lie in a single column. We address the version of consensus in which only processes in fault-free columns must agree. We prove that even this weaker version is unsolvable if the column may be completely faulty. We then present a solution for the case where at least one row is fault-free. The correct processes share a common orientation but know neither the identities of other processes nor the torus dimensions. Communication is synchronous. To achieve our solution, we build and prove correct an all-to-all broadcast algorithm, \PROG{BAT}, that guarantees delivery to all processes in fault-free columns. We use this algorithm to solve our weak consensus problem. Our solution, \PROG{CBAT}, runs in $O(H+W)$ rounds, where $H$ and $W$ are the torus height and width, respectively. We extend our consensus solution to the fixed message size model, where it runs in $O(H^3W^2)$ rounds. Our results apply immediately if the faults are located in a single row rather than a column.
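To illustrate why the $2f+1$ bound is restrictive on a torus, the following minimal sketch assumes the usual 4-neighbor torus (a grid with wraparound, $H, W \ge 3$); the abstract does not state the neighborhood explicitly, so this is an assumption. The helper names `torus_neighbors` and `max_faults_from_disjoint_paths` are hypothetical and are not part of \PROG{BAT} or \PROG{CBAT}. Under this assumption a node has only four neighbors, so at most four node-disjoint paths can reach it, and $2f+1 \le 4$ leaves room for a single fault.

```python
def torus_neighbors(row, col, height, width):
    """Four neighbors of (row, col) on a height x width torus.

    Assumption: 4-neighbor topology with wraparound in both dimensions.
    """
    return [
        ((row - 1) % height, col),  # up
        ((row + 1) % height, col),  # down
        (row, (col - 1) % width),   # left
        (row, (col + 1) % width),   # right
    ]


def max_faults_from_disjoint_paths(disjoint_paths):
    """Largest f satisfying 2f + 1 <= disjoint_paths."""
    return (disjoint_paths - 1) // 2


# Node degree is an upper bound on the number of node-disjoint paths into a node.
degree = len(set(torus_neighbors(0, 0, 5, 7)))
print(degree, max_faults_from_disjoint_paths(degree))
```

A column of faults on such a torus exceeds this budget as soon as it contains two faulty nodes, which is the dense-fault setting the abstract targets.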