Hardness of Maximum Likelihood Learning of DPPs

Determinantal Point Processes (DPPs) are a widely used probabilistic model for negatively correlated sets. DPPs have been successfully employed in Machine Learning applications to select a diverse yet representative subset of data. In these applications, one typically seeks a set of parameters that maximizes the likelihood of the data. The algorithms used for this task to date either optimize over a limited family of DPPs or use local improvement heuristics that provide no theoretical guarantees of optimality. In his seminal work on DPPs in Machine Learning, Kulesza (2011) conjectured that the problem is NP-complete. The lack of a formal proof prompted Brunel et al. (COLT 2017) to suggest that, contrary to Kulesza's conjecture, there might exist a polynomial-time algorithm for computing a maximum-likelihood DPP. They also presented preliminary evidence for a conjecture that, if true, might lead to such an algorithm. In this work we prove Kulesza's conjecture. In fact, we prove the following stronger hardness of approximation result: even computing a $\left(1-O\left(\frac{1}{\log^9 N}\right)\right)$-approximation to the maximum log-likelihood of a DPP on a ground set of $N$ elements is NP-complete. From a technical perspective, we reduce a gap instance of the \textsc{$3$-Coloring} problem on a hypergraph to the problem of approximating the maximum log-likelihood of a DPP. This hypergraph is based on the bounded-degree construction of Bogdanov et al. (FOCS 2002), which we enhance using the strong expanders of Alon and Capalbo (FOCS 2007). We demonstrate that if a rank-$3$ DPP achieves near-optimal log-likelihood, then its marginal kernel must encode an almost perfect ``vector-coloring'' of the hypergraph. Finally, we show that these continuous vectors can be decoded into a proper $3$-coloring of the hypergraph once a small fraction of ``noisy'' edges is removed.
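To make the objective concrete, here is a minimal sketch of the quantity being maximized. It assumes the standard L-ensemble parameterization, in which a subset $S$ has probability $\det(L_S)/\det(L+I)$; the abstract itself speaks of the marginal kernel $K$, which is related by $K = L(L+I)^{-1}$ whenever the L-ensemble exists. The function name `dpp_log_likelihood` and its interface are illustrative choices, not taken from the paper.

```python
import numpy as np

def dpp_log_likelihood(L, samples):
    """Average log-likelihood of observed subsets under an L-ensemble DPP.

    L       : (N, N) symmetric positive semidefinite kernel (assumed).
    samples : iterable of subsets, each a list of indices in range(N).
    """
    L = np.asarray(L, dtype=float)
    n = L.shape[0]
    samples = list(samples)

    # Normalizer: the sum of det(L_S) over all subsets S equals det(L + I).
    sign, log_norm = np.linalg.slogdet(L + np.eye(n))
    if sign <= 0:
        raise ValueError("L + I must have a positive determinant")

    total = 0.0
    for subset in samples:
        idx = list(subset)
        if idx:
            # log det of the principal submatrix indexed by the subset
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            if sign <= 0:
                return -np.inf  # the kernel assigns this subset probability zero
            total += logdet
        # The empty subset has determinant 1 and contributes log 1 = 0.
    return total / len(samples) - log_norm
```

For example, with `L = np.eye(2)` every one of the four subsets has probability $1/4$, so the function returns $-\log 4$ for any data set. Evaluating the objective for a given kernel is easy; the hardness result concerns finding a kernel that approximately maximizes it.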
