Uniform sampling of bipartite graphs and hypergraphs with given degree sequences is necessary for building null models to statistically evaluate their topology. Because these graphs can be represented as binary matrices, the problem is equivalent to uniformly sampling $r \times c$ binary matrices with fixed row and column sums. The trade algorithm, which includes both the curveball and fastball implementations, is the state of the art for performing such sampling. Its mixing time is unknown, although $5r$ is currently used as a heuristic. In this paper we propose a new distribution-based approach that not only provides an estimate of the mixing time, but also returns a sample of matrices that are guaranteed, within a user-chosen error tolerance, to be sampled uniformly at random. In numerical experiments on matrices that vary by size, fill, and row and column sum distributions, we find that the upper bound on mixing time is at least $10r$, and that it increases as a function of both $c$ and the fraction of cells containing a 1.
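For readers unfamiliar with the trade algorithm, the following is a minimal sketch of a curveball-style trade on a binary matrix, not the distribution-based approach proposed in this paper. The function names `curveball_trade` and `sample_matrix`, the set-based row representation, and the `trades_per_row` parameter are illustrative assumptions. The sketch also assumes that the $5r$ heuristic counts trades, and that the matrix has at least two rows.

```python
import random


def curveball_trade(rows, rng):
    """Perform one curveball-style trade in place (illustrative sketch).

    `rows` is a list of sets; rows[i] holds the column indices where row i has a 1.
    """
    # Pick two distinct rows at random.
    i, j = rng.sample(range(len(rows)), 2)
    shared = rows[i] & rows[j]
    # Columns where exactly one of the two rows has a 1; only these can be traded.
    pool = sorted((rows[i] | rows[j]) - shared)
    rng.shuffle(pool)
    # Row i must keep the same number of non-shared 1s it started with.
    k = len(rows[i]) - len(shared)
    rows[i] = shared | set(pool[:k])
    rows[j] = shared | set(pool[k:])


def sample_matrix(matrix, trades_per_row=5, seed=None):
    """Return a randomized copy of `matrix` with the same row and column sums.

    `trades_per_row=5` mirrors the 5r heuristic (assumed here to count trades).
    """
    rng = random.Random(seed)
    rows = [{c for c, v in enumerate(row) if v} for row in matrix]
    for _ in range(trades_per_row * len(rows)):
        curveball_trade(rows, rng)
    n_cols = len(matrix[0])
    return [[1 if c in row else 0 for c in range(n_cols)] for row in rows]
```

Each trade only redistributes the columns in which exactly one of the two chosen rows has a 1, so both row sums and all column sums are preserved. Calling `sample_matrix(m, trades_per_row=10)` would run $10r$ trades, but the sketch itself gives no guarantee that any fixed number of trades yields a uniform sample.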