ORL-AUDITOR: Dataset Auditing in Offline Deep Reinforcement Learning

Data is a critical asset in AI, as high-quality datasets can significantly improve the performance of machine learning models. In safety-critical domains such as autonomous vehicles, offline deep reinforcement learning (offline DRL) is frequently used to train models on pre-collected datasets, rather than by interacting with the real-world environment as online DRL does. To support the development of these models, many institutions make datasets publicly available under open-source licenses, but these datasets are at risk of misuse or infringement. Injecting watermarks into a dataset may protect its intellectual property, but this approach cannot handle datasets that have already been published and cannot feasibly be altered afterward. Other existing solutions, such as dataset inference and membership inference, do not work well in the offline DRL scenario because of the diverse behavior characteristics of the models and the constraints of the offline setting. In this paper, we advocate a new paradigm that leverages the fact that cumulative rewards can act as a unique identifier distinguishing DRL models trained on a specific dataset. To this end, we propose ORL-AUDITOR, the first trajectory-level dataset auditing mechanism for offline RL scenarios. Our experiments on multiple offline DRL models and tasks demonstrate the efficacy of ORL-AUDITOR, with auditing accuracy over 95% and false positive rates below 2.88%. We also provide insights into the practical implementation of ORL-AUDITOR by studying various parameter settings. Furthermore, we demonstrate the auditing capability of ORL-AUDITOR on open-source datasets from Google and DeepMind, highlighting its effectiveness in auditing published datasets. ORL-AUDITOR is open-sourced at https://github.com/link-zju/ORL-Auditor.
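
For readers unfamiliar with the term, the cumulative reward of a trajectory is the (optionally discounted) sum of its per-step rewards. The sketch below only illustrates that quantity. The function name `cumulative_reward`, the `gamma` parameter, and the sample rewards are assumptions made for illustration, and the code is not ORL-AUDITOR's auditing procedure.

```python
from typing import Sequence


def cumulative_reward(rewards: Sequence[float], gamma: float = 1.0) -> float:
    """Return the discounted sum of per-step rewards for one trajectory.

    Hypothetical helper for illustration only; gamma=1.0 gives the plain sum.
    """
    total = 0.0
    # Accumulate from the last step backwards: G_t = r_t + gamma * G_{t+1}
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Made-up per-step rewards for a three-step trajectory.
trajectory_rewards = [1.0, 0.0, 2.0]
undiscounted = cumulative_reward(trajectory_rewards)
discounted = cumulative_reward(trajectory_rewards, gamma=0.5)
```

With `gamma=0.5`, later rewards count for less, so the same trajectory yields a smaller value than the undiscounted sum.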
