Relying only on unlabeled data, self-supervised learning (SSL) can learn rich features in an economical and scalable way. As the workhorse for building foundation models, SSL has recently received a lot of attention and found wide application, which also raises security concerns. Backdoor attacks are a major type of threat: if a released dataset is maliciously poisoned, backdoored SSL models can misbehave when triggers are injected into test samples. The goal of this work is to investigate this potential risk. We notice that existing backdoors all require a considerable amount of \emph{labeled} data that may not be available for SSL. To circumvent this limitation, we explore a more restrictive setting called no-label backdoors, in which we have access to unlabeled data only. The key challenge is how to select a proper poison set without using label information. We propose two strategies for poison selection: clustering-based selection using pseudolabels, and contrastive selection derived from the mutual information principle. Experiments on CIFAR-10 and ImageNet-100 show that both no-label backdoors are effective on many SSL methods and outperform random poisoning by a large margin. Code will be available at https://github.com/PKU-ML/nlb.
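To make the idea of clustering-based selection with pseudolabels more concrete, the following is a minimal sketch and not the implementation from the repository above. It assumes that feature vectors for the unlabeled samples have already been extracted by some SSL encoder, uses a plain k-means to obtain pseudolabels, and then picks the poison set from a single cluster. The rule for choosing that cluster (the most compact one that is large enough) and the names `kmeans`, `select_poison_set`, and `budget` are assumptions made purely for illustration.

```python
import numpy as np


def _assign(features, centroids):
    """Return the index of the nearest centroid for every feature vector."""
    dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def kmeans(features, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns (pseudolabels, centroids)."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(features), size=k, replace=False)
    centroids = features[start].astype(float)
    for _ in range(n_iter):
        labels = _assign(features, centroids)
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return _assign(features, centroids), centroids


def select_poison_set(features, k, budget, seed=0):
    """Pick `budget` sample indices from one pseudolabel cluster.

    Assumption (illustrative only): choose the cluster with the smallest
    mean squared distance to its centroid among clusters that contain at
    least `budget` samples, then take the samples closest to the centroid.
    """
    labels, centroids = kmeans(features, k, seed=seed)
    best, best_spread = None, np.inf
    for j in range(k):
        idx = np.flatnonzero(labels == j)
        if len(idx) < budget:
            continue
        spread = ((features[idx] - centroids[j]) ** 2).sum(axis=1).mean()
        if spread < best_spread:
            best, best_spread = j, spread
    if best is None:
        raise ValueError("no cluster has at least `budget` samples")
    idx = np.flatnonzero(labels == best)
    dist = ((features[idx] - centroids[best]) ** 2).sum(axis=1)
    return idx[np.argsort(dist)[:budget]]
```

The point of the sketch is only that the poison set is drawn from one pseudolabel cluster rather than sampled uniformly at random, so that no ground-truth labels are needed. Which cluster to pick is a design choice, and the compactness criterion used here is one hypothetical option.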