Dynamic analyses are a standard approach to analyzing and testing concurrent programs. Such techniques observe program traces and analyze them to infer the presence or absence of bugs. At its core, each analysis maintains a partial order $P$ that represents order dependencies between the events of the analyzed trace $\sigma$. Naturally, the scalability of the analysis largely depends on how efficiently it maintains $P$. The standard data structure for this task has thus far been vector clocks. Vector clocks, however, are slow for analyses that follow a non-streaming style: inserting (and propagating) each new ordering in $P$ costs $O(n)$ time, where $n$ is the size of $\sigma$, and they cannot handle the deletion of existing orderings. In this paper we develop collective sparse segment trees (CSSTs), a simple yet elegant data structure for generically maintaining a partial order $P$. CSSTs thrive when the width $k$ of $P$ is much smaller than the size $n$ of its domain, allowing insertions, deletions, and queries for orderings in $P$ to run in $O(\log n)$ time. For a concurrent trace, $k$ is bounded by the number of its threads and is normally orders of magnitude smaller than its size $n$, making CSSTs well suited to this setting. Our experimental results confirm that CSSTs are currently the best data structure for handling a range of dynamic analyses from the existing literature.
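
To make the three operations concrete (insert, delete, query on a partial order of small width $k$), the following is a minimal Python sketch of a naive baseline. It does not implement CSSTs and is not taken from the paper. It assumes that each event is a pair `(chain, index)`, that events on the same chain (thread) are totally ordered by index, and that cross-chain orderings are stored per pair of chains. The class name `ChainPartialOrder` and all of its methods are hypothetical, chosen for illustration only.

```python
from collections import defaultdict

INF = float("inf")


class ChainPartialOrder:
    """Naive partial order over k chains; an event is a (chain, index) pair.

    Illustrative baseline only (hypothetical names), not the CSST structure.
    """

    def __init__(self, k):
        self.k = k
        # edges[i][j][a] = set of indices b such that (i, a) -> (j, b) is a direct ordering
        self.edges = [[defaultdict(set) for _ in range(k)] for _ in range(k)]

    def insert(self, src, dst):
        (i, a), (j, b) = src, dst
        self.edges[i][j][a].add(b)

    def delete(self, src, dst):
        (i, a), (j, b) = src, dst
        self.edges[i][j][a].discard(b)

    def _earliest_successor(self, i, a, j):
        # Smallest index in chain j that is directly ordered after some
        # event (i, a2) with a2 >= a. This is a linear scan in this sketch.
        best = INF
        for a2, targets in self.edges[i][j].items():
            if a2 >= a and targets:
                best = min(best, min(targets))
        return best

    def reachable(self, src, dst):
        """Return True if src is ordered before (or equal to) dst."""
        (i, a), (j, b) = src, dst
        # earliest[x] = smallest index in chain x known to be reachable from src
        earliest = [INF] * self.k
        earliest[i] = a
        changed = True
        while changed:  # fixpoint over the k chains
            changed = False
            for x in range(self.k):
                if earliest[x] == INF:
                    continue
                for y in range(self.k):
                    if x == y:
                        continue
                    m = self._earliest_successor(x, earliest[x], y)
                    if m < earliest[y]:
                        earliest[y] = m
                        changed = True
        return earliest[j] <= b


po = ChainPartialOrder(3)
po.insert((0, 2), (1, 5))      # event 2 of thread 0 precedes event 5 of thread 1
po.insert((1, 7), (2, 1))      # event 7 of thread 1 precedes event 1 of thread 2
print(po.reachable((0, 1), (2, 3)))
po.delete((1, 7), (2, 1))      # deletion needs no re-propagation in this representation
print(po.reachable((0, 1), (2, 3)))
```

Because nothing is propagated on insertion, deleting an ordering is a local update, which is the property vector clocks lack. The price in this sketch is the linear scan in `_earliest_successor`, a range-minimum lookup of the kind a segment tree can answer in logarithmic time; how the paper actually organizes its trees is not described in this abstract.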