In an era of widespread web scraping, unlearnable dataset methods have the potential to protect data privacy by preventing deep neural networks from generalizing. Beyond the practical limitations that make their use unlikely, we report several findings that call into question their ability to safeguard data. First, it is widely believed that neural networks trained on unlearnable datasets learn only shortcuts, simpler rules that are not useful for generalization. In contrast, we find that networks can in fact learn useful features that can be reweighted for high test performance, suggesting that image protection is not assured. Second, unlearnable datasets are believed to induce shortcut learning through the linear separability of the added perturbations. We provide a counterexample demonstrating that linear separability of perturbations is not a necessary condition. Finally, to emphasize why linearly separable perturbations should not be relied upon, we propose an orthogonal projection attack that allows learning from unlearnable datasets published at ICML 2021 and ICLR 2023. Our attack is significantly less complex than recently proposed techniques.
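The core idea behind an orthogonal projection attack can be illustrated with a minimal sketch. The abstract does not specify how the projection directions are obtained; the sketch below assumes they are the per-class weight vectors of a linear classifier fit to the perturbed images, and that each image is flattened to a vector. The function names `orthonormal_basis` and `project_out` are hypothetical, chosen for illustration, and the training step is omitted. It uses only the Python standard library so that it stays self-contained.

```python
import math


def orthonormal_basis(directions, tol=1e-10):
    """Gram-Schmidt: return an orthonormal basis for the span of `directions`.

    `directions` is assumed (not confirmed by the abstract) to hold the
    per-class weight vectors of a linear model fit to the perturbed data.
    """
    basis = []
    for d in directions:
        v = [float(di) for di in d]
        # Subtract the components along the basis vectors found so far.
        for q in basis:
            coeff = sum(vi * qi for vi, qi in zip(v, q))
            v = [vi - coeff * qi for vi, qi in zip(v, q)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm > tol:  # skip directions already inside the span
            basis.append([vi / norm for vi in v])
    return basis


def project_out(x, basis):
    """Return x minus its component in the subspace spanned by `basis`.

    `basis` must be orthonormal, e.g. the output of `orthonormal_basis`.
    """
    r = [float(xi) for xi in x]
    for q in basis:
        coeff = sum(xi * qi for xi, qi in zip(x, q))
        r = [ri - coeff * qi for ri, qi in zip(r, q)]
    return r


# Toy example: two hypothetical "perturbation directions" in a 4-D input space.
directions = [[1.0, 0.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]]
basis = orthonormal_basis(directions)
cleaned = project_out([3.0, 4.0, 5.0, 6.0], basis)
print(cleaned)
```

In the toy example the two directions span the first two coordinates, so the projection zeroes those coordinates and leaves the remaining ones unchanged. Under the stated assumption, a network would then be trained on the projected images rather than on the originals.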