Anomaly detection in imbalanced datasets is a frequent and crucial problem, especially in the medical domain, where retrieving and labeling irregularities is often expensive. By combining the generative stability of a $\beta$-variational autoencoder (VAE) with the discriminative strengths of generative adversarial networks (GANs), we propose a novel model, $\beta$-VAEGAN. We investigate methods for composing anomaly scores based on the discriminative and reconstructive capabilities of our model. Existing work focuses on linear combinations of these components to determine whether data is anomalous. We advance existing work by training a kernelized support vector machine (SVM) on the respective error components so that nonlinear relationships are also considered. This improves anomaly detection performance while allowing faster optimization. Lastly, we use the deviations from the Gaussian prior of $\beta$-VAEGAN to form a novel anomaly score component. Compared with state-of-the-art work, we improve the $F_1$ score for anomaly detection from 0.85 to 0.92 on the widely used MIT-BIH Arrhythmia Database.
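The abstract does not say how the deviation from the Gaussian prior is measured. One plausible reading, shown here only as a minimal sketch, is the closed-form Kullback-Leibler divergence between the encoder's diagonal Gaussian posterior and a standard normal prior. The function name `kl_to_standard_normal`, the use of log-variances, and the example values are assumptions made for illustration and are not taken from the paper.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for one encoded sample.

    Assumption: the encoder outputs a mean and a log-variance per latent
    dimension, as is common for VAEs; the paper may define the
    prior-deviation component differently.
    """
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var)
    )


# Hypothetical encoder outputs (illustrative values, not real data).
# A posterior that matches the prior exactly has zero divergence.
typical = kl_to_standard_normal(mu=[0.0, 0.0], log_var=[0.0, 0.0])
# A posterior shifted away from the prior yields a larger value.
unusual = kl_to_standard_normal(mu=[2.0, -1.5], log_var=[0.5, -0.5])
```

Under this reading, a larger value would suggest that a sample is encoded far from the prior, so it could serve as one input feature alongside the reconstruction and discriminator errors that the kernelized SVM combines.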