Anomaly detection in real-world scenarios is challenging because anomaly distributions are dynamic and often unknown, so methods must remain robust under an open-world assumption. The challenge is compounded in practice: models are deployed by private organizations that cannot share data because of privacy and competitive concerns, so anomaly information stays siloed even though sharing it could benefit every party. This paper addresses how to improve outlier detection within individual organizations without compromising data confidentiality. We propose a novel method that combines representation learning and federated learning to improve the detection of unknown anomalies. Specifically, our approach uses latent representations obtained from client-owned autoencoders to refine the decision boundary of inliers. Only model parameters are shared between organizations, which preserves data privacy. We evaluate the proposed method on two standard financial tabular datasets and one image dataset for anomaly detection in a distributed setting. The results show a strong improvement in the classification of unknown outliers at inference time for each organization's model.
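To make the parameter-sharing idea concrete, the following is a minimal sketch and not the paper's implementation. It assumes a linear autoencoder per client, plain federated averaging of the weights, and reconstruction error as the anomaly score; the paper's latent-space refinement of the inlier decision boundary is not reproduced here. All function names (`init_params`, `local_update`, `federated_average`, `anomaly_score`) and hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np


def init_params(n_features, n_latent, rng):
    """Random initial weights for a linear autoencoder (illustrative only)."""
    return {
        "W_enc": rng.normal(scale=0.1, size=(n_features, n_latent)),
        "W_dec": rng.normal(scale=0.1, size=(n_latent, n_features)),
    }


def anomaly_score(params, X):
    """Per-sample mean squared reconstruction error; higher means more anomalous."""
    recon = X @ params["W_enc"] @ params["W_dec"]
    return np.mean((recon - X) ** 2, axis=1)


def local_update(params, X, lr=0.01, epochs=5):
    """Gradient steps on the client's private data; X never leaves the client."""
    W_enc, W_dec = params["W_enc"].copy(), params["W_dec"].copy()
    n = X.shape[0]
    for _ in range(epochs):
        Z = X @ W_enc                          # latent representation, shape (n, k)
        err = Z @ W_dec - X                    # reconstruction residual, shape (n, d)
        grad_dec = Z.T @ err / n               # gradient w.r.t. decoder weights
        grad_enc = X.T @ (err @ W_dec.T) / n   # gradient w.r.t. encoder weights
        W_enc -= lr * grad_enc
        W_dec -= lr * grad_dec
    return {"W_enc": W_enc, "W_dec": W_dec}


def federated_average(client_params, client_sizes):
    """Average client weights, weighted by local sample count.

    Only parameters are exchanged here, never raw data.
    """
    total = sum(client_sizes)
    return {
        key: sum(p[key] * (n / total) for p, n in zip(client_params, client_sizes))
        for key in client_params[0]
    }


def run_round(global_params, client_datasets):
    """One communication round: local training on each client, then averaging."""
    updates = [local_update(global_params, X) for X in client_datasets]
    sizes = [X.shape[0] for X in client_datasets]
    return federated_average(updates, sizes)
```

In this sketch each organization would call `local_update` on its own inlier data, send only the resulting weights to be averaged, and then threshold `anomaly_score` locally. How the threshold is chosen, and how latent representations sharpen the inlier boundary, is left to the paper itself.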