Self-supervised learning (SSL) is a fast-growing paradigm that has recently transformed machine learning and its many real-world applications by learning from massive amounts of unlabeled data via self-generated supervisory signals. Unsupervised anomaly detection (AD) has also capitalized on SSL, self-generating pseudo-anomalies through various data augmentation functions or exposure to external data. In this vision paper, we first underline the importance of the choice of SSL strategy for AD performance, presenting evidence and studies from the AD literature. Building on the understanding that SSL introduces various hyperparameters (HPs) that must be carefully tuned, we present recent developments in unsupervised model selection and augmentation tuning for SSL-based AD. We then highlight emerging challenges and future opportunities: designing new pretext tasks and augmentation functions for different data modalities, creating novel model selection solutions for systematically tuning the SSL HPs, and capitalizing on the potential of pretrained foundation models for AD through effective density estimation.