Dynamic analyses are a standard approach to analyzing and testing concurrent programs. Such techniques observe program traces and analyze them to infer the presence or absence of bugs. At its core, each analysis maintains a partial order $P$ that represents order dependencies between events of the analyzed trace $\sigma$. Naturally, the scalability of the analysis largely depends on how efficiently it maintains $P$. The standard data structure for this task has thus far been vector clocks. Vector clocks, however, are slow for analyses that follow a non-streaming style: inserting (and propagating) each new ordering in $P$ costs $O(n)$ time, where $n$ is the size of $\sigma$, and they cannot handle the deletion of existing orderings. In this paper we develop collective sparse segment trees (CSSTs), a simple but elegant data structure for generically maintaining a partial order $P$. CSSTs thrive when the width $k$ of $P$ is much smaller than the size $n$ of its domain, allowing insertions, deletions, and queries for orderings in $P$ to run in $O(\log n)$ time. For a concurrent trace, $k$ is bounded by the number of its threads and is normally orders of magnitude smaller than its size $n$, which makes CSSTs a good fit for this setting. Our experimental results confirm that CSSTs are currently the best data structure for handling a range of dynamic analyses from the existing literature.
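To make the interface described above concrete, the following is a minimal sketch of a partial order over $k$ chains (one per thread) that supports inserting, deleting, and querying orderings. It is a naive baseline written for illustration, not the CSST data structure of the paper: the class name `ChainPartialOrder`, its method names, and the representation of events as `(thread, index)` pairs are all assumptions made here. Each lookup scans the stored cross-thread orderings linearly, which is the step a logarithmic-time structure would presumably need to speed up.

```python
from collections import defaultdict
import math


class ChainPartialOrder:
    """Naive reflexive partial order over k chains (hypothetical baseline, not a CSST).

    An event is a pair (thread, index). Events of the same thread are
    implicitly ordered by index (program order).
    """

    def __init__(self, num_chains):
        self.k = num_chains
        # edges[(i, j)] maps a source index on chain i to the set of
        # target indices on chain j that it is directly ordered before.
        self.edges = defaultdict(dict)

    def insert(self, src, dst):
        """Record the cross-chain ordering src -> dst."""
        (i, a), (j, b) = src, dst
        self.edges[(i, j)].setdefault(a, set()).add(b)

    def delete(self, src, dst):
        """Remove a previously inserted ordering src -> dst, if present."""
        (i, a), (j, b) = src, dst
        targets = self.edges[(i, j)].get(a)
        if targets:
            targets.discard(b)
            if not targets:
                del self.edges[(i, j)][a]

    def _earliest_successor(self, i, a, j):
        """Smallest index on chain j with a direct edge from some (i, a'), a' >= a.

        Linear scan over the stored edges from chain i to chain j.
        """
        best = math.inf
        for s, targets in self.edges[(i, j)].items():
            if s >= a:
                best = min(best, min(targets))
        return best

    def ordered(self, src, dst):
        """Return True if src is ordered before (or equal to) dst."""
        (i, a), (j, b) = src, dst
        # earliest[q] = smallest index on chain q known to be reachable from src.
        earliest = [math.inf] * self.k
        earliest[i] = a
        changed = True
        while changed:  # relax until no chain's earliest index improves
            changed = False
            for p in range(self.k):
                if earliest[p] == math.inf:
                    continue
                for q in range(self.k):
                    if p == q:
                        continue
                    cand = self._earliest_successor(p, earliest[p], q)
                    if cand < earliest[q]:
                        earliest[q] = cand
                        changed = True
        return earliest[j] <= b


po = ChainPartialOrder(3)
po.insert((0, 2), (1, 5))
po.insert((1, 7), (2, 1))
print(po.ordered((0, 0), (2, 3)))  # True: (0,0) -> (0,2) -> (1,5) -> (1,7) -> (2,1) -> (2,3)
po.delete((1, 7), (2, 1))
print(po.ordered((0, 0), (2, 3)))  # False once the ordering is deleted
```

Unlike timestamps that have already been propagated through vector clocks, the orderings here are stored explicitly, so a deletion simply removes an entry and later queries reflect it.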