We study the estimation problem for concurrent programs: given a bounded program $P$, estimate the number of Mazurkiewicz trace-equivalence classes induced by its interleavings. This quantity informs two practical questions for enumeration-based model checking: how long a model checking run is likely to take, and what fraction of the search space has been covered so far. We first show the counting problem is #P-hard even for restricted programs and, unless $P=NP$, inapproximable within any subexponential factor, ruling out efficient exact or randomized approximation algorithms. We give a Monte Carlo approach to obtain a poly-time unbiased estimator: we convert a stateless optimal DPOR algorithm into an unbiased estimator by viewing its exploration as a bounded-depth, bounded-width tree whose leaves are the maximal Mazurkiewicz traces. A classical estimator by Knuth, when run on this tree, yields an unbiased estimate. To control the variance, we apply stochastic enumeration by maintaining a small population of partial paths per depth whose evolution is coupled. We have implemented our estimator in the JMC model checker and evaluated it on shared-memory benchmarks. With modest budgets, our estimator yields stable estimates, typically within a 20% band, within a few hundred trials, even when the state space has $10^5$--$10^6$ classes. We also show how the same machinery estimates model-checking cost by weighting all explored graphs, not only complete traces. Our algorithms provide the first provable poly-time unbiased estimators for counting traces, a problem of considerable importance when allocating model checking resources.
翻译:我们研究并发程序的估计问题:给定一个有界程序$P$,估计由其交错执行生成的Mazurkiewicz迹等价类的数量。该信息为基于枚举的模型检查提供了两个实际问题的答案:一次模型检查运行可能耗时多久,以及搜索空间的覆盖比例。我们首先证明,即使在受限程序中,计数问题也是#P困难的,并且除非$P=NP$,否则无法在任意次指数因子内近似,这排除了高效的精确或随机近似算法。我们提出一种蒙特卡罗方法以获得多项式时间的无偏估计器:通过将无状态最优DPOR算法的探索视为一棵有界深度、有界宽度的树,其叶子节点为最大Mazurkiewicz迹,我们将该算法转化为无偏估计器。Knuth提出的经典估计器在此树上运行可得到无偏估计。为控制方差,我们采用随机枚举方法,在每层维护少量部分路径的群体,并耦合其演化。我们在JMC模型检查器中实现了该估计器,并在共享内存基准测试上进行了评估。在适度预算下,我们的估计器能够在几百次试验内获得稳定估计(通常在20%的误差带内),即使状态空间包含$10^5$--$10^6$个等价类。我们还展示了如何通过加权所有探索图(而不仅是完整迹)来估计模型检查成本。我们的算法首次提供了可证明的多项式时间无偏估计器以计数迹,这在分配模型检查资源时具有重要价值。