Score matching is an alternative to maximum likelihood (ML) for estimating a probability distribution parametrized up to a constant of proportionality. By fitting the "score" of the distribution (the gradient of the log-density with respect to the input), it sidesteps the need to compute this constant of proportionality, which is often intractable. While score matching and its variants are popular in practice, their benefits and tradeoffs relative to maximum likelihood, both computational and statistical, are not well understood theoretically. In this work, we give the first example of a natural exponential family of distributions such that the score matching loss is computationally efficient to optimize and has statistical efficiency comparable to ML, while the ML loss is intractable to optimize using a gradient-based method. The family consists of exponentials of polynomials of fixed degree, and our result can be viewed as a continuous analogue of recent developments in the discrete setting. Precisely, we show: (1) Designing a zeroth-order or first-order oracle for optimizing the maximum likelihood loss is NP-hard. (2) Maximum likelihood has a statistical efficiency polynomial in the ambient dimension and the radius of the parameters of the family. (3) Minimizing the score matching loss is both computationally and statistically efficient, with complexity polynomial in the ambient dimension.
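To make concrete why fitting the score avoids the normalizing constant, here is a minimal one-dimensional sketch. It is an illustration under stated assumptions, not the procedure analyzed in this work: it assumes a density proportional to exp(theta_1 x + ... + theta_d x^d), uses the standard integration-by-parts form of the score matching objective, and the names `score_matching_loss`, `fit_score_matching`, and `solve` are hypothetical. For this family the empirical loss is a quadratic in theta, so minimizing it reduces to a linear system in sample moments.

```python
def score_matching_loss(theta, samples):
    """Empirical score matching loss for p(x) proportional to exp(sum_k theta[k-1] * x**k).

    Uses the integration-by-parts form: mean of 0.5 * s(x)**2 + s'(x),
    where s(x) = d/dx log p(x). The normalizing constant never appears.
    """
    total = 0.0
    for x in samples:
        # s(x) = sum_k k * theta_k * x**(k-1)
        score = sum(k * t * x ** (k - 1) for k, t in enumerate(theta, start=1))
        # s'(x) = sum_{k>=2} k * (k-1) * theta_k * x**(k-2)
        dscore = sum(k * (k - 1) * t * x ** (k - 2)
                     for k, t in enumerate(theta, start=1) if k >= 2)
        total += 0.5 * score ** 2 + dscore
    return total / len(samples)


def solve(A, rhs):
    """Solve A z = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    M = [[float(a) for a in row] + [float(r)] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system; samples are not varied enough")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_score_matching(samples, degree):
    """Minimize the empirical loss, which equals 0.5 * theta^T A theta + b^T theta."""
    n = len(samples)

    def moment(p):
        return sum(x ** p for x in samples) / n

    ks = range(1, degree + 1)
    # A[j][k] = j * k * E[x^(j+k-2)],  b[k] = k * (k-1) * E[x^(k-2)]
    A = [[j * k * moment(j + k - 2) for k in ks] for j in ks]
    b = [k * (k - 1) * moment(k - 2) if k >= 2 else 0.0 for k in ks]
    # Setting the gradient A theta + b to zero gives theta = -A^{-1} b.
    return solve(A, [-v for v in b])
```

For degree 2 the family is Gaussian, and on the samples `[-1, 0, 1, 2, 3]` (mean 1, variance 2) `fit_score_matching` returns `theta = [0.5, -0.25]`, that is, mean/variance and -1/(2 * variance), which coincides with the ML estimate in this special case. The sketch does not enforce normalizability (even degree with a negative leading coefficient); a real estimator would need to check or constrain this.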