Many methods based on sparse and low-rank representation have been developed, along with guarantees of correct outlier detection. Self-representation states that a point in a subspace can always be expressed as a linear combination of other points in the subspace. A suitable Markov chain can be defined on the self-representation, which allows us to distinguish inliers from outliers. However, the reconstruction error of the self-representation, which remains informative for outlier detection, has been neglected. Inspired by gradient boosting, in this paper we propose a new outlier detection framework that iteratively combines a series of weak "outlier detectors" into a single strong one by constructing a multi-pass self-representation. At each stage, we construct a self-representation based on the elastic net and define a suitable Markov chain on it to detect outliers. The residual of the self-representation is passed to the next stage to learn the next weak outlier detector. This stage is repeated many times, and the final decision on outliers is made by combining the results of all stages. Experimental results on image and speaker datasets demonstrate its superiority over state-of-the-art sparse and low-rank outlier detection methods.
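The multi-pass idea can be illustrated with a minimal sketch, which is not the authors' implementation. Everything specific here is an assumption made for illustration: the function names, the regularization weights `l1` and `l2`, the proximal-gradient (ISTA) solver for the elastic-net problem, the uniform fallback for points with an empty representation, and the choice to combine stages by simple averaging. The sketch also assumes that a low averaged random-walk probability marks a likely outlier.

```python
import numpy as np


def elastic_net_self_representation(X, l1=0.05, l2=0.05, n_iter=300):
    """Approximately solve
        min_C 0.5*||X - XC||_F^2 + l1*||C||_1 + 0.5*l2*||C||_F^2,  diag(C) = 0
    by proximal gradient descent (ISTA). Columns of X are data points."""
    n = X.shape[1]
    G = X.T @ X
    # Step size 1/L, where L bounds the Lipschitz constant of the smooth part.
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 + l2)
    C = np.zeros((n, n))
    for _ in range(n_iter):
        grad = G @ C - G + l2 * C
        Z = C - step * grad
        # Soft-thresholding is the proximal operator of the l1 penalty.
        C = np.sign(Z) * np.maximum(np.abs(Z) - step * l1, 0.0)
        # A point may not be used to represent itself.
        np.fill_diagonal(C, 0.0)
    return C


def random_walk_scores(C, n_steps=50):
    """Average the state distribution of a random walk on the representation graph."""
    n = C.shape[0]
    W = np.abs(C).T  # row j holds the weights that point j places on the others
    row_sums = W.sum(axis=1, keepdims=True)
    safe_sums = np.where(row_sums > 0, row_sums, 1.0)
    # Assumption: a point with an empty representation jumps uniformly at random.
    P = np.where(row_sums > 0, W / safe_sums, 1.0 / n)
    pi = np.full(n, 1.0 / n)
    avg = np.zeros(n)
    for _ in range(n_steps):
        pi = pi @ P
        avg += pi
    return avg / n_steps


def multi_pass_scores(X, n_stages=3, **solver_kwargs):
    """Run several stages, feeding each stage's residual to the next one."""
    norms = np.maximum(np.linalg.norm(X, axis=0, keepdims=True), 1e-12)
    data = X / norms  # unit-norm columns for the first stage
    stage_scores = []
    for _ in range(n_stages):
        C = elastic_net_self_representation(data, **solver_kwargs)
        stage_scores.append(random_walk_scores(C))
        data = data - data @ C  # residual of this stage's self-representation
    # Assumption: the final score is the plain average over all stages.
    return np.mean(stage_scores, axis=0)
```

Calling `multi_pass_scores(X)` on a `(features, points)` array returns one score per point; under the assumptions above, the points with the lowest scores would be the outlier candidates. How the stage outputs are actually weighted and thresholded is not specified in the abstract.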