In an era of widespread web scraping, unlearnable dataset methods have the potential to protect data privacy by preventing deep neural networks from generalizing. However, beyond the practical limitations that make their use unlikely, we report several findings that call into question their ability to safeguard data. First, it is widely believed that neural networks trained on unlearnable datasets only learn shortcuts, simpler rules that are not useful for generalization. In contrast, we find that networks can in fact learn useful features that can be reweighted for high test performance, suggesting that image protection is not assured. Second, unlearnable datasets are believed to induce shortcut learning through linear separability of the added perturbations. We provide a counterexample, demonstrating that linear separability of perturbations is not a necessary condition. To emphasize why linearly separable perturbations should not be relied upon, we propose an orthogonal projection attack that allows learning from unlearnable datasets published at ICML 2021 and ICLR 2023. Our proposed attack is significantly less complex than recently proposed techniques.
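To make the idea of an orthogonal projection attack concrete, the following is a minimal sketch, not the authors' implementation. It assumes the attack proceeds in three steps: fit a linear classifier on the perturbed data, orthonormalize its weight vectors with a QR decomposition, and project those directions out of every sample. The synthetic data, the helper names `fit_linear_classifier` and `orthogonal_projection`, and all hyperparameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_linear_classifier(X, y, num_classes, lr=0.1, epochs=200):
    """Softmax regression trained by full-batch gradient descent.

    Returns a weight matrix W of shape (num_classes, dim).
    """
    n, d = X.shape
    W = np.zeros((num_classes, d))
    Y = np.eye(num_classes)[y]                       # one-hot labels
    for _ in range(epochs):
        logits = X @ W.T
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)            # softmax probabilities
        W -= lr * (P - Y).T @ X / n                  # cross-entropy gradient step
    return W

def orthogonal_projection(X, W):
    """Remove from each row of X its component in the row space of W."""
    Q, _ = np.linalg.qr(W.T)      # Q: (dim, num_classes), orthonormal columns
    return X - (X @ Q) @ Q.T

# Hypothetical toy data: each "image" is a clean vector plus a class-wise,
# linearly separable pattern standing in for an unlearnable perturbation.
rng = np.random.default_rng(0)
num_classes, dim, n = 3, 20, 300
y = rng.integers(0, num_classes, size=n)
clean = rng.normal(size=(n, dim))
class_patterns = rng.normal(size=(num_classes, dim))
X_unlearnable = clean + 2.0 * class_patterns[y]

W = fit_linear_classifier(X_unlearnable, y, num_classes)
X_recovered = orthogonal_projection(X_unlearnable, W)
```

After the projection, the linear directions that the classifier relied on carry no signal in `X_recovered`, so a network trained on it cannot use them as a shortcut. Under these assumptions the whole procedure needs only one linear fit and one QR decomposition, which is in keeping with the claim that the attack is less complex than other recent techniques.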