Sound Dynamic Deadlock Prediction in Linear Time

Deadlocks are among the most notorious concurrency bugs, and significant research has focused on detecting them efficiently. Dynamic predictive analyses observe concurrent executions and reason about alternative interleavings that can witness concurrency bugs. Such techniques offer scalability and sound bug reports, and have emerged as an effective approach for detecting concurrency bugs such as data races. Effective dynamic deadlock prediction, however, has proven a challenging task, as no deadlock predictor currently meets the requirements of soundness, high precision, and efficiency. In this paper, we first formally establish that this tradeoff is unavoidable, by showing that (a) sound and complete deadlock prediction is intractable, in general, and (b) even the seemingly simpler task of determining the presence of potential deadlocks, which often serve as unsound witnesses for actual predictable deadlocks, is intractable. The main contribution of this work is a new class of predictable deadlocks, called sync(hronization)-preserving deadlocks. Informally, these are deadlocks that can be predicted by reordering the observed execution while preserving the relative order of conflicting critical sections. We present two algorithms for sound deadlock prediction based on this notion. Our first algorithm, SPDOffline, detects all sync-preserving deadlocks, with running time that is linear per abstract deadlock pattern, a novel notion also introduced in this work. Our second algorithm, SPDOnline, predicts all sync-preserving deadlocks that involve two threads in a strictly online fashion, runs in overall linear time, and is better suited for a runtime monitoring setting. We implemented both algorithms and evaluated their ability to perform offline and online deadlock prediction on a large dataset of standard benchmarks.
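
To make the notion of a potential deadlock concrete, the following is a minimal sketch of finding two-thread lock-order inversions in an observed trace. It is not SPDOffline or SPDOnline: it only enumerates candidate patterns and does not check whether a reordering of the trace actually witnesses the deadlock, so its reports are unsound in the sense described above. The trace encoding (`(thread, op, lock)` tuples) and the function name `deadlock_patterns` are assumptions made for illustration, and the pairwise check is quadratic in the number of acquire events.

```python
from itertools import combinations


def deadlock_patterns(trace):
    """Return index pairs of acquire events that form a two-thread
    potential deadlock (a lock-order inversion).

    trace: list of (thread, op, lock) tuples, op in {"acq", "rel"}.
    This encoding is hypothetical and chosen only for this sketch.
    """
    held = {}      # thread -> locks currently held, in acquisition order
    acquires = []  # (event index, thread, lock, locks held when acquiring)
    for i, (thread, op, lock) in enumerate(trace):
        locks = held.setdefault(thread, [])
        if op == "acq":
            acquires.append((i, thread, lock, frozenset(locks)))
            locks.append(lock)
        elif op == "rel":
            locks.remove(lock)

    patterns = []
    for (i, t1, l1, h1), (j, t2, l2, h2) in combinations(acquires, 2):
        # Different threads, each requesting a lock the other holds,
        # and no common lock held that would serialize the two acquires.
        if t1 != t2 and l1 in h2 and l2 in h1 and not (h1 & h2):
            patterns.append((i, j))
    return patterns


# t1 takes a then b; t2 takes b then a: a classic inversion.
trace = [
    ("t1", "acq", "a"), ("t1", "acq", "b"),
    ("t1", "rel", "b"), ("t1", "rel", "a"),
    ("t2", "acq", "b"), ("t2", "acq", "a"),
    ("t2", "rel", "a"), ("t2", "rel", "b"),
]

# Same inversion, but both threads hold a common guard lock g first.
guarded = [
    ("t1", "acq", "g"), ("t1", "acq", "a"), ("t1", "acq", "b"),
    ("t1", "rel", "b"), ("t1", "rel", "a"), ("t1", "rel", "g"),
    ("t2", "acq", "g"), ("t2", "acq", "b"), ("t2", "acq", "a"),
    ("t2", "rel", "a"), ("t2", "rel", "b"), ("t2", "rel", "g"),
]

print(deadlock_patterns(trace))    # [(1, 5)]
print(deadlock_patterns(guarded))  # []
```

In the first trace, events 1 and 5 form a candidate pattern even though the observed execution itself did not deadlock. Deciding whether such a candidate corresponds to a real predictable deadlock is the step that a sound predictor has to add on top of this enumeration.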
