Fast Parallel Algorithms for Submodular $p$-Superseparable Maximization

Maximizing a non-negative, monotone, submodular function $f$ over $n$ elements under a cardinality constraint $k$ (SMCC) is a well-studied NP-hard problem with important applications in, e.g., machine learning and influence maximization. Although the problem admits polynomial-time approximation algorithms, solving it in practice often requires many queries to submodular functions that are expensive to compute. This has motivated significant research into parallel approximation algorithms in the adaptive complexity model, where adaptive complexity (adaptivity) measures the number of sequential rounds of $\text{poly}(n)$ function queries an algorithm requires. State-of-the-art algorithms achieve $(1-\frac{1}{e}-\varepsilon)$-approximate solutions with $O(\frac{1}{\varepsilon^2}\log n)$ adaptivity, which approaches the known adaptivity lower bounds. However, the $O(\frac{1}{\varepsilon^2} \log n)$ adaptivity bound applies to worst-case functions that are unlikely to appear in practice. In this paper, we therefore consider the special class of $p$-superseparable submodular functions, which places a reasonable constraint on $f$ based on the parameter $p$; this class is more amenable to maximization while retaining real-world applicability. Our main contribution is the algorithm LS+GS, a finer-grained version of the existing LS+PGB algorithm, designed for instances of SMCC in which $f$ is $p$-superseparable. It achieves an expected $(1-\frac{1}{e}-\varepsilon)$-approximate solution with $O(\frac{1}{\varepsilon^2}\log(p k))$ adaptivity, independent of $n$. Additionally, and independently of $p$-superseparability, LS+GS uses only $O(\frac{n}{\varepsilon} + \frac{\log n}{\varepsilon^2})$ oracle queries, improving the dependence on $\varepsilon^{-1}$ over the state-of-the-art LS+PGB; this is achieved through the design of a novel thresholding subroutine.
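To make the notions of oracle queries and adaptive rounds concrete, the following minimal Python sketch runs the classic sequential greedy algorithm for SMCC on a small coverage function and counts both quantities. It is not LS+GS or LS+PGB, neither of which is specified in this abstract; the names `make_coverage`, `greedy_smcc`, and the toy instance are hypothetical and chosen only for illustration. Within one round, all marginal-gain queries depend only on the current solution and could be issued in parallel, so this baseline uses up to $k$ adaptive rounds and up to $nk$ queries.

```python
from typing import Callable, Dict, List, Set, Tuple


def make_coverage(sets: Dict[int, Set[int]]) -> Callable[[List[int]], int]:
    """Return a coverage function: f(S) = size of the union of the chosen sets.

    Coverage functions are non-negative, monotone, and submodular.
    """
    def f(chosen: List[int]) -> int:
        covered: Set[int] = set()
        for i in chosen:
            covered |= sets[i]
        return len(covered)
    return f


def greedy_smcc(
    f: Callable[[List[int]], int], ground: List[int], k: int
) -> Tuple[List[int], int, int]:
    """Classic sequential greedy baseline (illustrative, not the paper's method).

    Returns (solution, adaptive_rounds, marginal_gain_queries).
    """
    solution: List[int] = []
    value = f(solution)  # value of the empty set, computed once
    rounds = 0
    queries = 0
    for _ in range(k):
        candidates = [x for x in ground if x not in solution]
        if not candidates:
            break
        # One adaptive round: every query below depends only on `solution`,
        # so all of them could be evaluated in parallel.
        gains = [(f(solution + [x]) - value, x) for x in candidates]
        queries += len(candidates)
        rounds += 1
        best_gain, best = max(gains, key=lambda t: t[0])
        if best_gain <= 0:
            break
        solution.append(best)
        value += best_gain
    return solution, rounds, queries


# Toy instance (made up for illustration): four sets over the items 1..7.
sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6, 7}, 3: {1}}
f = make_coverage(sets)
solution, rounds, queries = greedy_smcc(f, list(sets), k=2)
print(solution, f(solution), rounds, queries)
```

On this instance the greedy baseline needs one adaptive round per selected element. Low-adaptivity algorithms of the kind discussed above aim to add many elements per round, so that the number of rounds grows only logarithmically rather than linearly in $k$.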
