As machine learning models increasingly automate industrial applications, enforcing personal data ownership and intellectual property rights requires tracing training data back to their rightful owners. Membership inference algorithms approach this problem by using statistical techniques to discern whether a target sample was included in a model's training set. However, existing methods compute their statistics using only the unaltered target sample or simple augmentations of it. Such a sparse sampling of the model's behavior carries little information, leading to poor inference capabilities. In this work, we use adversarial tools to directly optimize for queries that are discriminative and diverse. Our improvements achieve significantly more accurate membership inference than existing methods, especially in offline scenarios and in the low false-positive regime, which is critical in legal settings. Code is available at https://github.com/YuxinWenRick/canary-in-a-coalmine.
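To make the baseline setting concrete, the following is a minimal sketch of a threshold-based membership test of the kind described above: a statistic is computed from the model's outputs on the target and a few simple augmentations, then compared against a threshold calibrated to a fixed low false-positive rate. It is not the adversarial query optimization proposed in this work. The function names, the log-odds statistic, and the toy probabilities are all illustrative assumptions rather than the authors' implementation.

```python
import math


def logit_confidence(p_true, eps=1e-6):
    """Log-odds of the probability the model assigns to the true label.

    Assumption: p_true comes from querying the model on the target
    (or on one of its augmentations)."""
    p = min(max(p_true, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def membership_score(query_probs):
    """Average log-odds confidence over all queries for one target."""
    return sum(logit_confidence(p) for p in query_probs) / len(query_probs)


def threshold_at_fpr(nonmember_scores, max_fpr):
    """Threshold such that at most a max_fpr fraction of the known
    non-members score strictly above it."""
    ranked = sorted(nonmember_scores, reverse=True)
    allowed = int(max_fpr * len(ranked))  # non-members we may wrongly flag
    return ranked[allowed] if allowed < len(ranked) else -math.inf


def is_member(query_probs, threshold):
    """Flag the target as a training member if its score exceeds the threshold."""
    return membership_score(query_probs) > threshold


# Toy calibration set: one query per known non-member (hypothetical values).
calibration = [membership_score([p]) for p in [0.5, 0.6, 0.7, 0.8, 0.9]]
tau = threshold_at_fpr(calibration, max_fpr=0.2)

# Two queries per target: the sample itself and one augmentation.
print(is_member([0.99, 0.97], tau))
print(is_member([0.60, 0.70], tau))
```

Calibrating the threshold on known non-members keeps the false-positive rate bounded by construction, which matches the low false-positive regime emphasized above. With only a handful of queries per target the score is a coarse summary of the model's behavior, which is the limitation that motivates optimizing the queries instead.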