Today, creators of data-hungry deep neural networks (DNNs) scour the Internet for training fodder, leaving users with little control over or knowledge of when their data is appropriated for model training. To empower users to counteract unwanted data use, we design, implement, and evaluate a practical system that enables users to detect whether their data was used to train a DNN model. We show how users can create special data points we call isotopes, which introduce "spurious features" into DNNs during training. With only query access to a trained model, and with no knowledge of the model training process or control over the data labels, a user can apply statistical hypothesis testing to detect whether a model has learned the spurious features associated with their isotopes by training on the user's data. This effectively turns DNNs' vulnerability to memorization and spurious correlations into a tool for data provenance. Our results confirm efficacy in multiple settings, detecting and distinguishing between hundreds of isotopes with high accuracy. We further show that our system works on public ML-as-a-service platforms and on larger models such as those trained on ImageNet, can use physical objects instead of digital marks, and remains generally robust against several adaptive countermeasures.
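The detection step described in the abstract (query access only, then a statistical hypothesis test) can be illustrated with a minimal sketch. The abstract does not say which test is used, so the paired sign-flip permutation test below is a stand-in chosen because it needs only the standard library. The names `apply_mark`, `detect_isotope`, and `make_toy_models`, the pixel-blending scheme, and both toy models are hypothetical and are not taken from the paper.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Model = Callable[[Vector], Sequence[float]]  # input -> class probabilities


def apply_mark(x: Vector, mark: Vector, blend: float = 0.5) -> Vector:
    """Blend a hypothetical isotope mark into an input (element-wise mix)."""
    return [(1.0 - blend) * xi + blend * mi for xi, mi in zip(x, mark)]


def detect_isotope(model: Model, probes: Sequence[Vector], mark: Vector,
                   tag_class: int, significance: float = 0.05,
                   n_resamples: int = 2000, seed: int = 0) -> Tuple[bool, float]:
    """One-sided paired test: does adding the mark raise P(tag_class)?"""
    # Per-probe shift in the tag-class probability caused by adding the mark.
    diffs = [model(apply_mark(x, mark))[tag_class] - model(x)[tag_class]
             for x in probes]
    observed = sum(diffs) / len(diffs)

    rng = random.Random(seed)
    at_least_as_large = 0
    for _ in range(n_resamples):
        # Null hypothesis: the mark has no effect, so each shift is equally
        # likely to be positive or negative. Flip signs at random.
        resampled = sum(d if rng.random() < 0.5 else -d for d in diffs) / len(diffs)
        if resampled >= observed:
            at_least_as_large += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (at_least_as_large + 1) / (n_resamples + 1)
    return p_value < significance, p_value


def make_toy_models(mark: Vector) -> Tuple[Model, Model]:
    """Toy stand-ins for a remote model; a real one would sit behind an API."""
    norm = sum(m * m for m in mark)

    def isotope_trained(x: Vector) -> Sequence[float]:
        # Pretend the model learned the mark as a spurious feature of class 1.
        score = sum(xi * mi for xi, mi in zip(x, mark)) / norm
        p_tag = 1.0 / (1.0 + math.exp(-8.0 * (score - 0.75)))
        return [1.0 - p_tag, p_tag]

    def clean(x: Vector) -> Sequence[float]:
        # Pretend the model never saw the mark: its output ignores it.
        return [0.5, 0.5]

    return isotope_trained, clean


mark = [1.0 if i % 4 == 0 else 0.0 for i in range(64)]
data_rng = random.Random(1)
probes = [[data_rng.random() for _ in range(64)] for _ in range(20)]
trained_model, clean_model = make_toy_models(mark)

print("isotope-trained model:", detect_isotope(trained_model, probes, mark, tag_class=1))
print("clean model:", detect_isotope(clean_model, probes, mark, tag_class=1))
```

In this sketch the user needs only the model's output probabilities for marked and unmarked probe inputs. A small p-value means the mark shifts predictions toward the tag class more than chance would explain, which is the kind of evidence the abstract describes.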