Fundamental Limits of Hypergraph Edge Partitioning under Independent Edge Sampling

Hypergraph edge partitioning is a central problem in theoretical and applied computer science, with broad impact on distributed computation, communications, optimization, and machine learning. In this setting, one is given a collection of hyperedges, each consisting of up to $d$ vertices from a ground set of size $n$, and seeks to assign these hyperedges across $N$ partitions so as to minimize, for example, the vertex footprint, i.e., the maximum number of vertices that appear in any partition. Here we identify the fundamental limits of hypergraph edge partitioning, optimized over all conceivable algorithms, for a broad class of probabilistic hypergraph models in which each hyperedge may appear independently with \emph{its own} probability. This class is general enough to encompass well-known models such as the Degree-Corrected and Mixed-Membership models, the Hypergraph Stochastic Block model, the Latent-Space/Geometric and Kernel models, and others. By pairing our deterministic partitioner with a new converse, we first show that, for any $n,d$, under the very mild condition $N \leq \binom{\lfloor\sqrt{\frac{nd}{2}}\rfloor}{d}$, and as long as the hyperedge set $\mathbf{X}$ satisfies $|\mathbf{X}| \gtrsim n N \log N$, then with probability at least $1-2/3n^z$ no algorithm can provide a footprint $\pi_{\mathbf{X}}$ smaller than $$\pi^{\bigstar}_{\mathbf{X}} = \frac{1}{2\sqrt{2}}\frac{n}{N^{1/d}}. $$ We then show that our hypergraph partitioner comes within a small constant factor of $\pi^{\bigstar}_{\mathbf{X}}$ for each $\mathbf{X}$. This optimality holds for dense and sparse hypergraphs alike (with sizes down to linear in $n$), and it additionally entails a near-optimally balanced allocation of hyperedges across partitions.
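As a numerical illustration only, the following minimal sketch evaluates the two closed-form expressions stated above: the footprint bound $\frac{1}{2\sqrt{2}}\frac{n}{N^{1/d}}$ and the largest $N$ permitted by the condition $N \leq \binom{\lfloor\sqrt{nd/2}\rfloor}{d}$. The function names `footprint_lower_bound` and `max_partitions` are hypothetical and do not come from the paper, and the sketch checks neither the hyperedge-count condition nor the probability statement.

```python
import math


def footprint_lower_bound(n: int, d: int, N: int) -> float:
    """Evaluate n / (2 * sqrt(2) * N**(1/d)), the bound quoted in the abstract.

    Hypothetical helper for illustration; not the authors' code.
    """
    return n / (2.0 * math.sqrt(2.0) * N ** (1.0 / d))


def max_partitions(n: int, d: int) -> int:
    """Largest N satisfying N <= C(floor(sqrt(n*d/2)), d).

    floor(sqrt(x)) equals isqrt(floor(x)) for any real x >= 0, so integer
    arithmetic suffices and avoids floating-point rounding.
    """
    m = math.isqrt((n * d) // 2)
    return math.comb(m, d)


if __name__ == "__main__":
    n, d, N = 10_000, 3, 64  # illustrative values, chosen arbitrarily
    print("condition holds:", N <= max_partitions(n, d))
    print("footprint bound:", footprint_lower_bound(n, d, N))
```

With these illustrative values the bound is roughly $0.088\,n$, so under the stated conditions every partitioning into 64 parts leaves some part touching about 9% of the vertices.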
