One-class classification (OCC) is the problem of deciding whether an observed sample belongs to a target class. We consider the problem of learning an OCC model that performs as the generalized likelihood ratio test (GLRT), given a dataset containing samples of the target class. The GLRT is a well-known classifier, provably optimal under specific assumptions, that solves the same problem when the statistics of the target class are available. To this end, we consider both the multilayer perceptron neural network (NN) and the support vector machine (SVM) models. They are trained as two-class classifiers, using an artificial dataset for the alternative class obtained by generating random samples uniformly over the domain of the target-class dataset. We prove that, under suitable assumptions, the models converge to the GLRT as the dataset grows large. Moreover, we show that the one-class least squares SVM (OCLSSVM) with suitable kernels performs as the GLRT at convergence. Lastly, we prove that the widely used autoencoder (AE) classifier does not generally provide the GLRT.
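The training scheme described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a one-dimensional standard Gaussian target class, and it replaces the NN and SVM models with a logistic model on the features (1, x, x^2), chosen only because the log-ratio between a Gaussian and a uniform density is quadratic in x. All function names and parameter values are hypothetical. When the two classes are balanced and the artificial class is uniform, the learned log-odds approximate the log target density up to a constant, so thresholding the score amounts to thresholding the target density.

```python
import math
import random


def make_dataset(n, seed=0):
    """Draw n target samples and n artificial samples.

    Assumption for illustration: the target class is N(0, 1).
    The artificial (alternative) class is uniform over the range
    spanned by the target samples.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0.0, 1.0) for _ in range(n)]
    lo, hi = min(target), max(target)
    artificial = [rng.uniform(lo, hi) for _ in range(n)]
    return target, artificial


def features(x):
    # Quadratic features: enough to represent a Gaussian log-density.
    return (1.0, x, x * x)


def score(w, x):
    # Log-odds of "target" versus "artificial" for sample x.
    return sum(wi * fi for wi, fi in zip(w, features(x)))


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(target, artificial, steps=1500, lr=0.1):
    """Fit the two-class logistic model by batch gradient descent."""
    data = [(x, 1.0) for x in target] + [(x, 0.0) for x in artificial]
    w = [0.0, 0.0, 0.0]
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for x, y in data:
            err = sigmoid(score(w, x)) - y
            for j, fj in enumerate(features(x)):
                grad[j] += err * fj
        w = [wi - lr * g / len(data) for wi, g in zip(w, grad)]
    return w


def is_target(w, x, threshold=0.0):
    # Accept x as a target-class sample when its score exceeds the threshold.
    return score(w, x) > threshold


target, artificial = make_dataset(300)
w = train(target, artificial)
```

In this sketch the score should decrease as x moves away from the mean of the target class, which is the same ordering induced by the true Gaussian density. The threshold in `is_target` is a free parameter that trades false alarms against missed detections.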