This paper gives a new algorithm for sampling tree-weighted partitions of a large class of planar graphs. Formally, the tree-weighted distribution on $k$-partitions of a graph assigns each $k$-partition a probability proportional to the product of the numbers of spanning trees of its partition classes. Recent work on computational redistricting analysis has driven special interest in the conditional distribution where all partition classes have the same size (balanced partitions). One widely used class of Markov chains aims to sample from balanced tree-weighted $k$-partitions using a sampler for balanced tree-weighted 2-partitions. Previous implementations of this 2-partition sampler draw a random spanning tree and check whether it contains an edge whose removal produces a balanced 2-component forest, rejecting if not. In practice, this is a significant computational bottleneck. We show that it is in fact possible to sample from the balanced tree-weighted 2-partition distribution directly, without first sampling a spanning tree; the acceptance and rejection rates are the same as in previous samplers. We prove that on a wide class of planar graphs, encompassing the network structures that typically arise from the geographic data used in computational redistricting, our algorithm takes expected linear time $O(n)$. Notably, this is asymptotically faster than the best known methods for generating random trees, which take $O(n \log^2 n)$ time for approximate sampling and $O(n^{1 + \log \log \log n / \log \log n})$ time for exact sampling. Additionally, we show that a variant of our algorithm gives a speedup to $O(n \log n)$ for exact sampling of uniformly random trees on these families of graphs, improving the bounds for both exact and approximate sampling. We implement our algorithm and benchmark it on grid graphs, finding that it outperforms the standard bipartitioning method in the widely used GerryChain library.
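To make the baseline concrete, the draw-then-check step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's new algorithm and not GerryChain's code: the function names (`grid_graph`, `wilson_spanning_tree`, `balanced_side`, `sample_balanced_bipartition`) are hypothetical, the uniform spanning tree is drawn with Wilson's loop-erased random walk algorithm as a stand-in for whatever tree sampler an implementation uses, and balance is taken to mean exactly $n/2$ vertices per side. It omits any further weighting correction a full sampler might apply.

```python
import random
from collections import defaultdict


def grid_graph(rows, cols):
    """Adjacency lists of a rows x cols grid graph (an illustrative planar graph)."""
    adj = defaultdict(list)
    for r in range(rows):
        for c in range(cols):
            for nb in ((r + 1, c), (r, c + 1)):
                if nb[0] < rows and nb[1] < cols:
                    adj[(r, c)].append(nb)
                    adj[nb].append((r, c))
    return dict(adj)


def wilson_spanning_tree(adj, rng):
    """Uniform spanning tree via Wilson's algorithm; returns a child -> parent map."""
    nodes = list(adj)
    parent = {nodes[0]: None}  # the root has no parent
    for start in nodes:
        # Random walk until the current tree is hit, remembering only the
        # most recent exit from each vertex (this erases loops implicitly).
        nxt, u = {}, start
        while u not in parent:
            nxt[u] = rng.choice(adj[u])
            u = nxt[u]
        # Retrace the loop-erased path and attach it to the tree.
        u = start
        while u not in parent:
            parent[u] = nxt[u]
            u = nxt[u]
    return parent


def balanced_side(parent):
    """Return one side of the balanced cut of the tree, or None if no edge works."""
    n = len(parent)
    if n % 2:
        return None
    children = defaultdict(list)
    root = None
    for v, p in parent.items():
        if p is None:
            root = v
        else:
            children[p].append(v)
    # Preorder traversal; its reverse visits children before parents.
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children[v])
    size = {}
    for v in reversed(order):
        size[v] = 1 + sum(size[c] for c in children[v])
        if v != root and 2 * size[v] == n:
            # Removing the edge (v, parent[v]) splits the tree in half;
            # collect the vertices of the subtree rooted at v.
            side, stack = [], [v]
            while stack:
                w = stack.pop()
                side.append(w)
                stack.extend(children[w])
            return frozenset(side)
    return None


def sample_balanced_bipartition(adj, rng, max_tries=1000):
    """Draw trees until one has a balanced edge; return (side, number of tries)."""
    for tries in range(1, max_tries + 1):
        side = balanced_side(wilson_spanning_tree(adj, rng))
        if side is not None:
            return side, tries
    return None
```

Calling `sample_balanced_bipartition(grid_graph(4, 4), random.Random(0))` returns one side of a balanced split together with the number of trees drawn. Each rejected draw costs a full spanning-tree sample, which is the bottleneck the abstract refers to.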