We consider the clustering aggregation problem: given a set of clusterings, find an aggregated clustering that minimizes the sum of mismatches to the input clusterings. In the binary case (each clustering is a bipartition), this problem was known to be NP-hard under Turing reductions. We strengthen this result by providing a polynomial-time many-one reduction. Our result also implies that, unless the Exponential Time Hypothesis (ETH) fails, no $2^{o(n)}\cdot |I'|^{O(1)}$-time algorithm solves every clustering instance $I'$ with $n$ elements. On the positive side, we show that the problem is fixed-parameter tractable with respect to the number of input clusterings, and we give an integer linear programming formulation.
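To make the objective concrete, the following is a minimal brute-force sketch for tiny instances. It rests on two assumptions the abstract does not spell out: that a "mismatch" is a pair of elements placed together by one clustering and apart by the other (the pairwise disagreement, or Mirkin, distance), and that the aggregated clustering may be any partition of the elements. The function names `mismatches`, `set_partitions`, and `aggregate` are hypothetical. The enumeration is exponential in $n$; it is neither the fixed-parameter algorithm nor the integer linear program mentioned above.

```python
from itertools import combinations


def mismatches(a, b):
    """Count element pairs that one labeling groups together and the other separates.

    Assumption: this pairwise count is the mismatch measure being minimized.
    """
    return sum(
        (a[i] == a[j]) != (b[i] == b[j])
        for i, j in combinations(range(len(a)), 2)
    )


def set_partitions(n):
    """Yield every partition of range(n) as a restricted-growth label tuple."""
    def extend(labels, used):
        if len(labels) == n:
            yield tuple(labels)
            return
        # An element may join any existing cluster or open exactly one new one.
        for c in range(used + 1):
            labels.append(c)
            yield from extend(labels, max(used, c + 1))
            labels.pop()

    yield from extend([], 0)


def aggregate(clusterings):
    """Return (total cost, labeling) of a partition minimizing the summed mismatches."""
    n = len(clusterings[0])
    return min(
        (sum(mismatches(p, c) for c in clusterings), p)
        for p in set_partitions(n)
    )


# Three bipartitions of four elements, each given as a tuple of cluster labels.
inputs = [(0, 0, 1, 1), (0, 0, 1, 1), (0, 1, 0, 1)]
cost, consensus = aggregate(inputs)
```

Each input is a label tuple, so `(0, 0, 1, 1)` places elements 0 and 1 in one cluster and elements 2 and 3 in the other. Label names carry no meaning: `(1, 1, 0, 0)` describes the same bipartition.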