Remote Direct Memory Access (RDMA) is a technology that allows direct memory access from the memory of one computer into that of another without involving either one's operating system. This enables high-throughput, low-latency networking, which is especially useful in massively parallel computer clusters. In this paper, we study the reachability and robustness problems for RDMA programs. We show that reachability is undecidable in general, even for a restricted fragment of the model. We then focus on robustness, which asks whether a program exhibits the same behaviours under the RDMA and sequential consistency (SC) semantics, and prove that this problem is decidable. Our central technical result establishes a normal form for robustness violations, showing that any non-robust program admits a violating execution of a specific form. We then leverage this normal form to obtain a decision procedure that reduces robustness to reachability in finite-state programs with counters, yielding an EXPSPACE upper bound in the general case, and a PSPACE upper bound in the absence of poll operations. Finally, we also show that both of these bounds are optimal.
翻译:远程直接内存访问(RDMA)是一种允许计算机在不涉及操作系统的情况下,直接访问另一台计算机内存的技术。这实现了高吞吐量、低延迟的网络通信,尤其适用于大规模并行计算机集群。本文研究了RDMA程序的可达性与鲁棒性问题。我们证明,即使在该模型的受限片段中,可达性在一般情况下也是不可判定的。随后我们聚焦于鲁棒性问题,即程序在RDMA语义与顺序一致性(SC)语义下是否表现出相同的行为,并证明该问题是可判定的。我们的核心技术成果建立了鲁棒性违规的规范范式,表明任何非鲁棒程序都存在一个特定形式的违规执行。我们利用这一规范范式提出了一种判定过程,将鲁棒性问题归结为带计数器有限状态程序的可达性问题,从而在一般情况下得到EXPSPACE上界,在不存在轮询操作的情况下得到PSPACE上界。最后,我们还证明了这两个上界均为最优上界。