DWBench: Holistic Evaluation of Watermark for Dataset Copyright Auditing

The surging demand for large-scale datasets in deep learning has heightened the need for effective copyright protection, because unauthorized use puts data owners at risk. Although dataset watermarking holds promise for auditing and verifying usage, existing methods are evaluated inconsistently, which impedes fair comparison and assessment of real-world viability. To address this gap, we propose a two-layer taxonomy that categorizes methods by implementation (model-based vs. model-free injection; model-behavior vs. model-message verification), offering a structured framework for cross-task analysis. We then develop DWBench, a unified benchmark and open-source toolkit for systematically evaluating image dataset watermark techniques in classification and generation tasks. Using DWBench, we assess 25 representative methods under standardized conditions, perturbation-based robustness tests, multi-watermark coexistence, and multi-user interference. In addition to four commonly used metrics, we report two new ones: sample significance, which measures fine-grained watermark distinguishability, and verification success rate, which measures dataset-level auditing. Together they enable accurate and reproducible benchmarking. Key findings reveal inherent trade-offs: no single method dominates all scenarios; classification and generation tasks require specialized approaches; and existing techniques are unstable at low watermark rates and in realistic multi-user settings, where they show elevated false positives or performance declines. We hope that DWBench can facilitate advances in watermark reliability and practicality, thus strengthening copyright safeguards in the face of widespread AI-driven data exploitation.
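The abstract does not define how verification success rate is computed, so the following is only a minimal sketch of one plausible reading, not DWBench's implementation. It assumes a model-behavior verification setting for classification: the auditor queries a suspect model with watermarked trigger samples and checks whether the target label appears more often than chance, using an exact one-sided binomial test. The function names, the chance rate of 1/num_classes as the null hypothesis, and the threshold alpha = 0.01 are all illustrative assumptions.

```python
from math import comb


def binomial_p_value(hits: int, n: int, p0: float) -> float:
    """One-sided p-value P(X >= hits) for X ~ Binomial(n, p0)."""
    return sum(comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(hits, n + 1))


def audit(predictions, target_label, num_classes, alpha=0.01):
    """Return True if trigger samples hit the target label more often than chance.

    Assumption: under the null hypothesis (model not trained on the
    watermarked data) each trigger sample lands on the target label
    with probability 1 / num_classes.
    """
    hits = sum(1 for p in predictions if p == target_label)
    p_value = binomial_p_value(hits, len(predictions), 1.0 / num_classes)
    return p_value < alpha


def verification_success_rate(trials, target_label, num_classes, alpha=0.01):
    """Fraction of independent audit trials in which the watermark is verified."""
    outcomes = [audit(t, target_label, num_classes, alpha) for t in trials]
    return sum(outcomes) / len(outcomes)


# Hypothetical predictions on 10 trigger samples from two suspect models.
example_trials = [
    [3, 3, 3, 3, 3, 3, 3, 3, 0, 1],  # target label 3 appears 8 times out of 10
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],  # target label 3 appears once (chance level)
]
example_rate = verification_success_rate(example_trials, target_label=3, num_classes=10)
```

In this toy input the first trial rejects the null hypothesis and the second does not, so half of the audits succeed. A real benchmark would repeat the audit over many trained models and would also run it on models that never saw the watermark, so that false positives can be measured.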

