We design and implement two single-pass semi-streaming algorithms for the maximum weight $k$-disjoint matching ($k$-DM) problem. Given an integer $k$, the $k$-DM problem is to find $k$ pairwise edge-disjoint matchings such that the sum of the weights of the matchings is maximized. For $k \geq 2$, this problem is NP-hard. Our first algorithm is based on the primal-dual framework applied to a linear programming relaxation of the problem and is $\frac{1}{3+\varepsilon}$-approximate. We also develop an approximation-preserving reduction from $k$-DM to the maximum weight $b$-matching problem. Leveraging this reduction and an existing semi-streaming $b$-matching algorithm, we design a $(\frac{1}{2+\varepsilon})(1 - \frac{1}{k+1})$-approximate semi-streaming algorithm for $k$-DM. For any constant $\varepsilon > 0$, both algorithms require $O(nk \log_{1+\varepsilon}^2 n)$ bits of space. To the best of our knowledge, this is the first study of semi-streaming algorithms for the $k$-DM problem. We compare our two algorithms to state-of-the-art offline algorithms on 95 real-world and synthetic test problems, including thirteen graphs generated from data center network traces. On these instances, our streaming algorithms used significantly less memory (6$\times$ to 512$\times$ less) and ran faster than the offline algorithms. Our solutions were often within 5% of the best weights obtained by the offline algorithms. We highlight that the existing offline algorithms exhaust 1 TB of memory on most of the large instances ($>1$ billion edges), whereas our streaming algorithms can solve these problems using only 100 GB of memory for $k=8$.
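To make the problem definition and the two stated guarantees concrete, here is a minimal Python sketch. It is not the authors' implementation and contains no streaming logic. The helper names (`is_matching`, `kdm_weight`, `primal_dual_ratio`, `reduction_ratio`) and the edge-weight dictionary layout are assumptions chosen for illustration. The first two functions check that a candidate solution really consists of pairwise edge-disjoint matchings and sum its weight; the last two evaluate the approximation factors exactly as written above.

```python
from fractions import Fraction
from itertools import chain


def is_matching(edges):
    """Return True if no vertex is shared by two edges (and there are no self-loops)."""
    seen = set()
    for u, v in edges:
        if u == v or u in seen or v in seen:
            return False
        seen.update((u, v))
    return True


def kdm_weight(matchings, weight):
    """Total weight of a candidate k-DM solution, or None if it is infeasible.

    matchings: list of k edge lists, each edge an (u, v) tuple.
    weight:    dict mapping frozenset({u, v}) to the edge weight (assumed layout).
    """
    if not all(is_matching(m) for m in matchings):
        return None
    keys = [frozenset(e) for e in chain.from_iterable(matchings)]
    # Each list is already a matching, so a repeated key means that
    # two different matchings share an edge.
    if len(keys) != len(set(keys)):
        return None
    return sum(weight[key] for key in keys)


def primal_dual_ratio(eps):
    """Guarantee of the primal-dual algorithm: 1 / (3 + eps)."""
    return 1 / (3 + Fraction(eps))


def reduction_ratio(k, eps):
    """Guarantee via the b-matching reduction: (1 / (2 + eps)) * (1 - 1 / (k + 1))."""
    return (1 / (2 + Fraction(eps))) * (1 - Fraction(1, k + 1))
```

Comparing the two expressions algebraically, the reduction-based guarantee exceeds the primal-dual one exactly when $k > 2 + \varepsilon$. With $\varepsilon = 1/10$, for instance, `reduction_ratio(8, Fraction(1, 10))` is $80/189$ while `primal_dual_ratio(Fraction(1, 10))` is $10/31$, whereas at $k = 2$ the primal-dual bound is the larger of the two.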