Data is a critical asset in AI, as high-quality datasets can significantly improve the performance of machine learning models. In safety-critical domains such as autonomous vehicles, offline deep reinforcement learning (offline DRL) is frequently used to train models on pre-collected datasets rather than by interacting with the real-world environment, as online DRL does. To support the development of these models, many institutions make datasets publicly available under open-source licenses, but these datasets are at risk of misuse or infringement. Injecting watermarks into a dataset may protect its intellectual property, but this approach cannot handle datasets that have already been published and cannot be altered afterward. Other existing solutions, such as dataset inference and membership inference, do not work well in the offline DRL scenario because of the diverse behavior characteristics of the models and the constraints of the offline setting. In this paper, we advocate a new paradigm that leverages the fact that cumulative rewards can act as a unique identifier distinguishing DRL models trained on a specific dataset. To this end, we propose ORL-AUDITOR, the first trajectory-level dataset auditing mechanism for offline DRL scenarios. Our experiments on multiple offline DRL models and tasks demonstrate the efficacy of ORL-AUDITOR, with auditing accuracy over 95% and false positive rates below 2.88%. We also provide insights into the practical implementation of ORL-AUDITOR by studying various parameter settings. Furthermore, we demonstrate the auditing capability of ORL-AUDITOR on open-source datasets from Google and DeepMind, highlighting its effectiveness in auditing published datasets. ORL-AUDITOR is open-sourced at https://github.com/link-zju/ORL-Auditor.
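To make the notion of a cumulative reward concrete, the following minimal sketch computes the (optionally discounted) cumulative reward of a trajectory and measures how far a suspect set of cumulative rewards lies from a reference set. It is a toy illustration only and not the ORL-AUDITOR procedure: the function names `cumulative_reward` and `mean_abs_gap`, the discount factor `gamma`, the mean-absolute-difference comparison, and all numbers are assumptions introduced here for illustration.

```python
from typing import Sequence


def cumulative_reward(rewards: Sequence[float], gamma: float = 1.0) -> float:
    """Return the (optionally discounted) sum of one trajectory's rewards."""
    total = 0.0
    # Accumulate backwards so each step adds its reward to the discounted tail.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def mean_abs_gap(reference: Sequence[float], suspect: Sequence[float]) -> float:
    """Mean absolute difference between two equal-length lists of returns.

    Hypothetical comparison chosen for illustration; the paper's actual
    auditing statistic is not specified in the abstract.
    """
    if len(reference) != len(suspect) or not reference:
        raise ValueError("expected two non-empty lists of equal length")
    return sum(abs(r - s) for r, s in zip(reference, suspect)) / len(reference)


# Made-up per-step rewards for two trajectories of a dataset.
trajectories = [[1.0, 0.0, 2.0], [0.5, 0.5, 0.5]]
reference_returns = [cumulative_reward(t, gamma=0.5) for t in trajectories]

# Made-up cumulative rewards associated with two hypothetical suspect models.
close_suspect = [1.5, 1.0]
far_suspect = [4.0, 3.0]

print(mean_abs_gap(reference_returns, close_suspect))
print(mean_abs_gap(reference_returns, far_suspect))
```

In this toy setting, a smaller gap means that the suspect's cumulative rewards resemble those derived from the dataset more closely. How such a gap would be turned into an auditing decision is left open here.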