Density-estimation-based anomaly detection schemes typically model anomalies as examples that reside in low-density regions. We propose a modified density estimation problem and demonstrate its effectiveness for anomaly detection. Specifically, we assume the density function of normal samples is uniform in some compact domain. This assumption implies that the density function is more stable (has lower variance) around normal samples than around anomalies. We first corroborate this assumption empirically using a wide range of real-world data. Then, we design a variance-stabilized density estimation problem that maximizes the likelihood of the observed samples while minimizing the variance of the density around normal samples. We introduce an ensemble of autoregressive models to learn the variance-stabilized distribution. Finally, we perform an extensive benchmark on 52 datasets, demonstrating that our method leads to state-of-the-art results while alleviating the need for data-specific hyperparameter tuning.
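
To make the trade-off between likelihood and variance concrete, the following is a minimal sketch, not the formulation used in this work. It assumes the objective can be written as the negative mean log-likelihood plus a weighted penalty on the variance of the log-density over the training samples. The weight `lam`, the function names, and the two stand-in densities (a Gaussian and a uniform) are all hypothetical and serve only as illustration.

```python
import math
import statistics


def variance_stabilized_loss(log_densities, lam=1.0):
    """Negative mean log-likelihood plus lam times the variance of the log-density.

    ASSUMPTION: this penalized form and the weight `lam` are illustrative;
    the abstract does not state the exact objective.
    """
    nll = -statistics.fmean(log_densities)
    spread = statistics.pvariance(log_densities)
    return nll + lam * spread


def gaussian_log_density(x, mu=0.0, sigma=1.0):
    """Log-density of a 1-D Gaussian, used only as a stand-in model."""
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)


def uniform_log_density(x, low=-2.0, high=2.0):
    """Log-density of a uniform distribution on [low, high]."""
    return -math.log(high - low) if low <= x <= high else float("-inf")


# Hypothetical "normal" samples, all inside the uniform support.
samples = [-1.5, -0.5, 0.0, 0.5, 1.5]

gauss_ll = [gaussian_log_density(x) for x in samples]
unif_ll = [uniform_log_density(x) for x in samples]

# The uniform model has zero log-density variance on these samples,
# so the penalty term vanishes for it but not for the Gaussian.
gauss_loss = variance_stabilized_loss(gauss_ll, lam=1.0)
unif_loss = variance_stabilized_loss(unif_ll, lam=1.0)

print("Gaussian stand-in loss:", gauss_loss)
print("Uniform stand-in loss: ", unif_loss)
```

On these toy samples the uniform stand-in incurs no variance penalty, which mirrors the assumption that the density is flat around normal samples. A natural anomaly score under such a model would be the negative log-density of a test point, though that choice is also an assumption here.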