Top-k-Convolution and the Quest for Near-Linear Output-Sensitive Subset Sum

In the classical Subset Sum problem we are given a set $X$ of $n$ integers and a target $t$, and the task is to decide whether there exists a subset of $X$ which sums to $t$. A recent line of research has resulted in $\tilde{O}(t)$-time algorithms, which are (near-)optimal under popular complexity-theoretic assumptions. On the other hand, the standard dynamic programming algorithm runs in time $O(n \cdot |\mathcal{S}(X,t)|)$, where $\mathcal{S}(X,t)$ is the set of all subset sums of $X$ that are smaller than $t$. Furthermore, all known pseudopolynomial algorithms solve a stronger task, since they compute the whole set $\mathcal{S}(X,t)$. As the two aforementioned running times are incomparable, in this paper we ask whether one can achieve the best of both worlds: running time $\tilde{O}(|\mathcal{S}(X,t)|)$. In particular, we ask whether $\mathcal{S}(X,t)$ can be computed in time near-linear in the output size. Using a diverse toolkit containing techniques such as color coding, sparse recovery, and sumset estimates, we make considerable progress on this question and design an algorithm running in time $\tilde{O}(|\mathcal{S}(X,t)|^{4/3})$. Central to our approach is the study of top-$k$-convolution, a natural problem of independent interest: given sparse polynomials with non-negative coefficients, compute the lowest $k$ non-zero monomials of their product. We design an algorithm for it running in time $\tilde{O}(k^{4/3})$, by combining sparse convolution with sumset estimates from Additive Combinatorics. Moreover, we provide evidence that going beyond some of the barriers we have faced requires either an algorithmic breakthrough or possibly new techniques from Additive Combinatorics on how to pass from information on restricted sumsets to information on unrestricted sumsets.
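
To make the two problems named above concrete, the following Python sketch shows the standard dynamic program for the set of subset sums and a quadratic-time baseline for top-$k$-convolution. It is an illustration only, not the $\tilde{O}(k^{4/3})$ algorithm of the paper. The function names `subset_sums` and `top_k_convolution`, and the dictionary representation of sparse polynomials (exponent mapped to coefficient), are assumptions made for this example. The sketch also keeps sums up to and including $t$ so that the decision question can be answered directly, which is an assumption about the boundary case.

```python
from collections import defaultdict


def subset_sums(X, t):
    """Standard DP: all subset sums of X that are at most t.

    Each of the n elements triggers one pass over the current set,
    so the expected time is O(n * |S(X, t)|) with hash sets.
    (Including t itself is an assumption of this sketch.)
    """
    sums = {0}
    for x in X:
        sums |= {s + x for s in sums if s + x <= t}
    return sums


def top_k_convolution(p, q, k):
    """Naive baseline: lowest k non-zero monomials of the product p * q.

    p and q map exponent -> non-negative coefficient (hypothetical
    representation). Because coefficients are non-negative, nothing
    cancels, so only the k lowest exponents of each factor can
    contribute to the k lowest monomials of the product.
    This runs in roughly O(k^2) time, not the paper's bound.
    """
    a = sorted(e for e, c in p.items() if c > 0)[:k]
    b = sorted(e for e, c in q.items() if c > 0)[:k]
    prod = defaultdict(int)
    for ea in a:
        for eb in b:
            prod[ea + eb] += p[ea] * q[eb]
    # Return (exponent, coefficient) pairs in increasing exponent order.
    return sorted(prod.items())[:k]


if __name__ == "__main__":
    print(sorted(subset_sums([1, 2, 5], 6)))
    print(top_k_convolution({0: 1, 1: 2, 2: 1}, {0: 1, 1: 1}, 2))
```

The truncation step in `top_k_convolution` relies on the non-negativity assumption from the problem statement: if an exponent $a$ is not among the $k$ lowest of its polynomial, then every sum $a + b$ is preceded by at least $k$ distinct smaller exponents of the product.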
