The unauthorized use of personal data for commercial purposes and the clandestine acquisition of private data for training machine learning models continue to raise concerns. In response, researchers have proposed availability attacks that aim to render data unexploitable. However, many current attack methods are rendered ineffective by adversarial training. In this paper, we re-examine the concept of unlearnable examples and find that the existing robust error-minimizing noise optimizes an inaccurate objective. Building on this observation, we introduce a novel optimization paradigm that yields improved protection in less computation time. Extensive experiments substantiate the soundness of our approach. Moreover, our method establishes a robust foundation for future research in this area.