Detecting outliers is pivotal for any machine learning model deployed and operated in the real world. It is especially important for deep neural networks, which have been shown to be overconfident on such inputs. Moreover, even deep generative models that allow estimation of the probability density of the input fail at this task. In this work, we concentrate on one specific type of such models: Variational Autoencoders (VAEs). First, we unveil a significant theoretical flaw in the assumptions of the classical VAE model. Second, to alleviate this flaw, we enforce an accommodating topological property, compactness, on the image of the deep neural mapping to the latent space; this gives us the means to provably bound the image within determined limits by squeezing inliers and outliers together. We enforce compactness using two approaches: (i) the Alexandroff extension and (ii) a fixed Lipschitz continuity constant on the encoder mapping of the VAE. Finally, and most importantly, we discover that anomalous inputs predominantly tend to land in the vacant latent holes within the compact space, which enables their successful identification. For that reason, we introduce a specifically devised score for hole detection and evaluate the solution against several baseline benchmarks, achieving promising results.
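To make the idea of a compact latent space with detectable holes more concrete, the following is a minimal sketch and not the method of this work. It relies on the standard fact that the Alexandroff (one-point) extension of R^n is homeomorphic to the sphere S^n, realised here by inverse stereographic projection. The abstract does not specify the hole-detection score, so `hole_score` is a hypothetical stand-in: the geodesic distance from a test code to its k-th nearest training code on the sphere. The names `to_sphere`, `hole_score`, and the parameter `k` are assumptions introduced for illustration.

```python
import numpy as np


def to_sphere(z):
    """Map latent codes in R^n onto the unit sphere S^n in R^(n+1).

    Inverse stereographic projection from the north pole: the origin goes to
    the south pole, and points far from the origin approach the north pole,
    which plays the role of the added "point at infinity".
    """
    z = np.asarray(z, dtype=float)
    sq = np.sum(z ** 2, axis=-1, keepdims=True)
    return np.concatenate([2.0 * z, sq - 1.0], axis=-1) / (sq + 1.0)


def hole_score(z_test, z_train, k=5):
    """Hypothetical hole score (an assumption, not the paper's definition).

    Returns, for each test code, the geodesic (great-circle) distance on the
    sphere to its k-th nearest training code. Large values suggest the code
    landed in a region not covered by training data.
    """
    s_test = to_sphere(z_test)
    s_train = to_sphere(z_train)
    # Cosine of the angle between unit vectors; clip guards against rounding.
    cos = np.clip(s_test @ s_train.T, -1.0, 1.0)
    angles = np.arccos(cos)  # shape: (n_test, n_train)
    return np.sort(angles, axis=1)[:, k - 1]


# Illustrative usage with synthetic latent codes (not real encoder outputs).
rng = np.random.default_rng(0)
z_train = rng.normal(size=(200, 2))
z_inlier = rng.normal(size=(3, 2))
z_outlier = np.array([[50.0, 50.0]])
print("inlier scores: ", hole_score(z_inlier, z_train))
print("outlier scores:", hole_score(z_outlier, z_train))
```

Because the sphere is compact, every score is bounded above by pi, so a fixed threshold on the score is meaningful regardless of how far from the origin an encoder places an input.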