Semi-supervised learning (SSL) achieves remarkable performance with only a small fraction of labeled data by leveraging vast amounts of unlabeled data collected from the Internet. However, this large pool of untrusted data is highly vulnerable to data poisoning, which can lead to backdoor attacks. Current backdoor defenses are not yet effective against this vulnerability in SSL. In this study, we propose a novel method, Unlabeled Data Purification (UPure), which disrupts the association between trigger patterns and target classes by introducing perturbations in the frequency domain. Leveraging the Rate-Distortion-Perception (RDP) trade-off, we further identify the frequency band in which the perturbations are added and justify this selection. Notably, UPure purifies poisoned unlabeled data without requiring any extra clean labeled data. Extensive experiments on four benchmark datasets and five SSL algorithms demonstrate that UPure effectively reduces the attack success rate from 99.78% to 0% while maintaining model accuracy. Code is available at \url{https://github.com/chengyi-chris/UPure}.
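The general idea of perturbing one frequency band of an image can be illustrated with a minimal sketch. This is not the UPure implementation: the function name `perturb_frequency_band`, the radial band limits `r_min` and `r_max`, the Gaussian noise model, and the `noise_std` scaling are all assumptions made for illustration, and the abstract does not state which band or which kind of perturbation UPure actually uses.

```python
import numpy as np


def perturb_frequency_band(image, r_min, r_max, noise_std=0.1, seed=0):
    """Add Gaussian noise to the 2-D FFT coefficients inside one radial band.

    image     : 2-D float array (a single-channel image).
    r_min/max : hypothetical band limits, in frequency bins measured from
                the centre of the shifted spectrum (band is r_min <= r < r_max).
    noise_std : hypothetical noise level, relative to the mean spectrum magnitude.
    """
    rng = np.random.default_rng(seed)
    h, w = image.shape

    # Move to the frequency domain with the zero frequency at the centre.
    spectrum = np.fft.fftshift(np.fft.fft2(image))

    # Boolean mask selecting coefficients whose radius falls in the band.
    yy, xx = np.ogrid[:h, :w]
    radius = np.hypot(yy - h // 2, xx - w // 2)
    mask = (radius >= r_min) & (radius < r_max)

    # Complex Gaussian noise, applied only inside the selected band.
    scale = noise_std * np.abs(spectrum).mean()
    noise = rng.normal(0.0, scale, spectrum.shape) \
        + 1j * rng.normal(0.0, scale, spectrum.shape)
    perturbed = spectrum + mask * noise

    # Back to the pixel domain; keep the real part, since the noise is not
    # conjugate-symmetric and the inverse transform is complex in general.
    return np.fft.ifft2(np.fft.ifftshift(perturbed)).real
```

Because the band in this sketch excludes the zero-frequency coefficient whenever `r_min > 0`, the mean intensity of the image is left unchanged and only the selected band is altered. In a purification setting, such a function would be applied to every unlabeled image before SSL training.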