We propose a new approach to non-parametric density estimation based on regularizing a Sobolev norm of the density. The method is statistically consistent and makes the inductive bias of the model clear and interpretable. Although the associated kernel has no closed analytic form, we show that it can be approximated by sampling. The optimization problem that determines the density is non-convex, and standard gradient methods perform poorly on it. However, we show that an appropriate initialization combined with natural gradients yields well-performing solutions. Finally, the approach produces pre-densities (i.e., functions that do not necessarily integrate to 1), which rules out log-likelihood for cross-validation; we show that Fisher-divergence-based score matching methods can be adapted for this task instead. We evaluate the resulting method on ADBench, a recent comprehensive anomaly detection benchmark suite, and find that it ranks second among more than 15 algorithms.
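To illustrate why score matching remains usable when a model is known only up to normalization, the following is a minimal sketch and not the procedure of this work. It assumes a hypothetical one-dimensional Gaussian-shaped pre-density family `q(x) = c * exp(-x^2 / (2 sigma^2))` as a stand-in for the Sobolev-regularized estimator, and it uses the standard empirical Hyvärinen objective, the mean of `(log q)'' + 0.5 * ((log q)')^2` over held-out samples. The names `score_matching_loss`, `make_log_q`, `held_out`, and `candidates` are illustrative assumptions.

```python
import numpy as np

def score_matching_loss(log_q, x, h=1e-3):
    """Empirical Hyvarinen score-matching objective in 1-D.

    log_q: log of a (possibly unnormalized) density, vectorized over x.
    Derivatives are approximated by central finite differences.
    """
    grad = (log_q(x + h) - log_q(x - h)) / (2 * h)
    lap = (log_q(x + h) - 2 * log_q(x) + log_q(x - h)) / h**2
    return np.mean(lap + 0.5 * grad**2)

def make_log_q(sigma, c=1.0):
    # Hypothetical pre-density family (an assumption for illustration):
    # q(x) = c * exp(-x^2 / (2 sigma^2)); c is an arbitrary positive scale.
    return lambda x: np.log(c) - x**2 / (2 * sigma**2)

rng = np.random.default_rng(0)
held_out = rng.normal(0.0, 2.0, size=20_000)  # held-out data, true std = 2

# Select the hyperparameter by minimizing the score-matching loss.
candidates = np.linspace(0.5, 4.0, 36)
losses = [score_matching_loss(make_log_q(s), held_out) for s in candidates]
best_sigma = candidates[int(np.argmin(losses))]

# The loss depends only on derivatives of log q, so the scale c drops out.
loss_c1 = score_matching_loss(make_log_q(2.0, c=1.0), held_out)
loss_c100 = score_matching_loss(make_log_q(2.0, c=100.0), held_out)
```

In this toy setting the selected `best_sigma` should land near the true standard deviation, and `loss_c1` and `loss_c100` should agree up to floating-point error. This invariance to the scale `c` is the property that lets such an objective stand in for log-likelihood when comparing pre-densities.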