MedLocker: A Transferable Adversarial Watermarking for Preventing Unauthorized Analysis of Medical Image Dataset

Collecting medical image datasets is a demanding, labor-intensive process that requires significant resources. These datasets may also contain personally identifiable information, so unauthorized access must be prevented. Failing to do so could violate the intellectual property rights of the dataset owner and compromise patient privacy. Safeguarding medical datasets against unauthorized use by AI diagnostic models is therefore a pressing challenge. To address it, we propose MedLocker, a novel visible adversarial watermarking method for medical image copyright protection. Our approach iteratively optimizes the position and transparency of a watermark logo so that the target model's performance degrades and it produces incorrect predictions. Importantly, we minimize the impact on clinical visualization by constraining watermark positions using semantic masks (WSM), which are bounding boxes of lesion regions obtained from semantic segmentation. To assess transferability, we verify that a watermark generated on a single model also transfers to other models. Additionally, each run generates a unique watermark parameter list, which can serve as a certificate for verifying authorization. We evaluate MedLocker on various mainstream backbones and validate the feasibility of adversarial watermarking for copyright protection on two widely used diabetic retinopathy detection datasets. Our results demonstrate that MedLocker can effectively protect the copyright of medical datasets and prevent unauthorized users from analyzing medical images with AI diagnostic models.
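The idea of searching over a watermark's position and transparency under a lesion-box constraint can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the optimizer, so plain random search is substituted here, and the sketch assumes the watermark must stay outside the lesion bounding box. The names `blend_watermark`, `overlaps`, `random_search`, and `score_fn` are hypothetical; `score_fn` stands in for any callable that returns the target model's confidence in the correct class (lower means a stronger watermark).

```python
import numpy as np

def blend_watermark(image, logo, top, left, alpha):
    """Alpha-blend `logo` onto a float copy of `image` at (top, left)."""
    out = image.astype(np.float32).copy()
    h, w = logo.shape[:2]
    region = out[top:top + h, left:left + w]
    out[top:top + h, left:left + w] = (1.0 - alpha) * region + alpha * logo
    return out

def overlaps(top, left, h, w, box):
    """True if the logo rectangle intersects box = (y0, x0, y1, x1), end-exclusive."""
    y0, x0, y1, x1 = box
    return top < y1 and top + h > y0 and left < x1 and left + w > x0

def random_search(image, logo, lesion_box, score_fn, n_trials=200, seed=0):
    """Random search (an assumed stand-in optimizer) over position and alpha.

    Returns the parameter dict with the lowest score, or None if every
    sampled position overlapped the lesion box.
    """
    rng = np.random.default_rng(seed)
    H, W = image.shape[:2]
    h, w = logo.shape[:2]
    best = None
    for _ in range(n_trials):
        top = int(rng.integers(0, H - h + 1))
        left = int(rng.integers(0, W - w + 1))
        alpha = float(rng.uniform(0.1, 0.6))  # assumed transparency range
        if overlaps(top, left, h, w, lesion_box):
            continue  # assumption: keep the logo off the lesion region
        score = score_fn(blend_watermark(image, logo, top, left, alpha))
        if best is None or score < best["score"]:
            best = {"top": top, "left": left, "alpha": alpha, "score": score}
    return best
```

The returned dictionary plays the role of a per-run parameter record in this sketch; how MedLocker actually encodes and verifies its parameter list is not described in the abstract.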

