Consider the classical Min-Sum Set Cover problem: we are given a universe $\mathcal{U}$ of $n$ elements and a collection $\mathcal{S}$ of $k$ subsets of $\mathcal{U}$, where each set has an associated cost. The goal is to find a sequence of sets from $\mathcal{S}$ that covers all elements of $\mathcal{U}$ and minimizes the sum of the covering times of the elements. The covering time of an element $u$ is the total cost of all sets that appear in the sequence until $u$ is first covered. This problem can be seen as a scheduling problem on a single machine, where each job represents a set and each element represents a utility that must be provided by at least one of the jobs. The goal is then to schedule the jobs so as to minimize the sum of the provision times of the utilities. In this paper we consider a natural generalization of this problem to $m$ machines that process the jobs in parallel. We call this problem Parallel Min-Sum Set Cover. To obtain approximation algorithms for both related and unrelated machines, we use a crucial subproblem which we call Parallel Maximum Coverage. We give a randomized bicriteria $(1-1/e-\varepsilon, O(\log m/\log\log m))$-approximation algorithm for this subproblem based on a natural LP relaxation. This can then be used to obtain an $O(\log m/\log\log m)$-approximation algorithm for the Min-Sum Set Cover problem on unrelated machines. For related machines, we allow the aforementioned bicriteria approximation algorithm to run in FPT time, and apply a technique that transforms a related machines instance into one consisting of $O(\log m)$ unrelated machines, to get an $\left(\frac{8e}{e-1}+\varepsilon\right)$-approximation algorithm for this case, where $\frac{8e}{e-1}+\varepsilon<12.66$. We also give a greedy algorithm for unit-cost sets subject to precedence constraints, with an approximation ratio of $O(k^{2/3})$.
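To make the single-machine objective concrete, the following minimal Python sketch evaluates the sum of covering times for a given order of sets. It is an illustration only, not an algorithm from the paper: the function name `sum_of_covering_times`, the dictionary-based input format, and the toy instance are all hypothetical. It also assumes the convention that the covering time of an element includes the cost of the first set that covers it, which is one reading of the definition above.

```python
from typing import Dict, Hashable, Iterable, List, Set


def sum_of_covering_times(
    universe: Iterable[Hashable],
    sets: Dict[str, Set[Hashable]],
    cost: Dict[str, int],
    order: List[str],
) -> int:
    """Sum of covering times of all elements for one sequence of sets.

    Assumption (not fixed by the text): the covering time of an element
    is the cumulative cost up to and including the first set covering it.
    """
    remaining = set(universe)  # elements not yet covered
    elapsed = 0                # total cost of the sets processed so far
    total = 0                  # running sum of covering times

    for name in order:
        elapsed += cost[name]
        newly_covered = remaining & sets[name]
        # Every newly covered element gets covering time `elapsed`.
        total += elapsed * len(newly_covered)
        remaining -= newly_covered

    if remaining:
        raise ValueError(f"sequence leaves elements uncovered: {remaining}")
    return total


# Hypothetical toy instance: 4 elements, 3 sets.
universe = {1, 2, 3, 4}
sets = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4}}
cost = {"A": 2, "B": 1, "C": 1}

print(sum_of_covering_times(universe, sets, cost, ["A", "B"]))  # 2*3 + 3*1 = 9
print(sum_of_covering_times(universe, sets, cost, ["B", "A"]))  # 1*2 + 3*2 = 8
```

On this toy instance, scheduling the cheap set `B` first gives a smaller objective than starting with the larger but more expensive set `A`, which shows why the order of the sequence matters and not only which sets are chosen.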