Lossy Kernelization for (Implicit) Hitting Set Problems

We revisit the complexity of kernelization for the $d$-Hitting Set problem. This is a classic problem in Parameterized Complexity, which encompasses several of the most well-studied problems in the field, such as Vertex Cover, Feedback Vertex Set in Tournaments (FVST) and Cluster Vertex Deletion (CVD). In fact, $d$-Hitting Set encompasses any deletion problem to a hereditary property that can be characterized by a finite set of forbidden induced subgraphs. With respect to bit size, the kernelization complexity of $d$-Hitting Set is essentially settled: there exists a kernel with $O(k^d)$ bits ($O(k^d)$ sets and $O(k^{d-1})$ elements), and this is tight by the result of Dell and van Melkebeek [STOC 2010, JACM 2014]. Still, the question of whether there exists a kernel for $d$-Hitting Set with fewer elements has remained one of the major open problems in Kernelization. In this paper, we first show that if we allow the kernelization to be lossy, with a qualitatively better loss than the best possible approximation ratio of polynomial-time approximation algorithms, then one can obtain kernels where the number of elements is linear for every fixed $d$. Further, based on this, we present our main result: we show that there exist approximate Turing kernelizations for $d$-Hitting Set that even beat the established bit-size lower bounds for exact kernelizations -- in fact, we use a constant number of oracle calls, each with ``near linear'' ($O(k^{1+\epsilon})$) bit size, that is, almost the best one could hope for. Lastly, for two special cases of implicit 3-Hitting Set, namely FVST and CVD, we obtain ``best of both worlds'' results: $(1+\epsilon)$-approximate kernelizations with a linear number of vertices. In terms of size, this substantially improves on the exact kernels of Fomin et al. [SODA 2018, TALG 2019], with simpler arguments.
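To illustrate what "implicit 3-Hitting Set" means for CVD, the following minimal sketch builds the set family from a graph. It relies on the standard fact that a graph is a cluster graph (a disjoint union of cliques) exactly when it has no induced path on three vertices, so a vertex set is a CVD solution exactly when it hits every induced $P_3$. The function names `induced_p3_sets` and `is_hitting_set` are hypothetical, chosen for this illustration, and the brute-force enumeration is not part of the paper's kernelization.

```python
from itertools import combinations


def induced_p3_sets(vertices, edges):
    """Return the family of vertex triples that induce a path on three vertices.

    Illustrative helper (hypothetical name): the family is "implicit" because
    it is derived from the graph rather than given as part of the input.
    """
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    family = set()
    for a, b, c in combinations(sorted(vertices), 3):
        # A triple induces a P3 exactly when it spans two of the three possible edges.
        present = (b in adj[a]) + (c in adj[a]) + (c in adj[b])
        if present == 2:
            family.add(frozenset((a, b, c)))
    return family


def is_hitting_set(family, solution):
    """Check that `solution` intersects every set in `family`."""
    solution = set(solution)
    return all(s & solution for s in family)


# Example graph: the path 1 - 2 - 3 - 4.
vertices = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (3, 4)]
family = induced_p3_sets(vertices, edges)

# Deleting vertex 2 leaves an isolated vertex and a single edge, a cluster graph.
print(is_hitting_set(family, {2}))
# Deleting vertex 1 leaves the induced path 2 - 3 - 4, which is not a cluster graph.
print(is_hitting_set(family, {1}))
```

On this example the family consists of the two triples $\{1,2,3\}$ and $\{2,3,4\}$. The enumeration takes time cubic in the number of vertices, which is fine for an illustration but is only a naive way to make the implicit family explicit.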
