Privacy-preserving computer vision is an important emerging problem in machine learning and artificial intelligence. Prevalent methods tackling this problem use differential privacy (DP) or obfuscation techniques to protect the privacy of individuals. In both cases, the utility of the trained model is sacrificed heavily in the process. In this work, we present an anonymization pipeline that replaces sensitive human subjects in video datasets with synthetic avatars within context, employing a combined rendering and stable diffusion-based strategy. Additionally, we propose masked differential privacy ({MaskDP}) to protect non-anonymized but privacy-sensitive background information. MaskDP allows for controlling the sensitive regions where differential privacy is applied, in contrast to applying DP on the entire input. This combined methodology provides strong privacy protection while minimizing the usual performance penalty of privacy-preserving methods. Experiments on multiple challenging action recognition datasets demonstrate that our proposed techniques result in better utility-privacy trade-offs compared to standard differentially private training in the especially demanding $\epsilon<1$ regime.
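The core idea behind MaskDP, applying a DP mechanism only inside designated sensitive regions rather than over the entire input, can be illustrated with a minimal sketch. This is an assumption-laden toy example, not the paper's actual training procedure: it applies the classical Gaussian mechanism (clip, then add calibrated noise) to masked pixels of a single image, whereas the actual method integrates masking into differentially private training. The function name `mask_dp_gaussian` and all parameter defaults are hypothetical.

```python
import numpy as np

def mask_dp_gaussian(image, mask, clip=1.0, sigma=2.0, rng=None):
    """Illustrative sketch: Gaussian mechanism restricted to masked regions.

    image: float array in [0, 1], shape (H, W, C)
    mask:  boolean array, shape (H, W); True marks privacy-sensitive pixels
    clip:  per-pixel clipping bound (bounds the sensitivity)
    sigma: noise multiplier; larger sigma gives a smaller epsilon
    """
    rng = np.random.default_rng(rng)
    out = image.copy()
    # Clip sensitive pixel values so each pixel's contribution is bounded.
    sensitive = np.clip(out[mask], 0.0, clip)
    # Add Gaussian noise calibrated to the clipping bound,
    # only inside the sensitive region.
    noise = rng.normal(0.0, sigma * clip, size=sensitive.shape)
    out[mask] = sensitive + noise
    # Pixels outside the mask are returned untouched, preserving utility.
    return out
```

The contrast with standard DP is visible directly: with a full-image mask this degenerates to noising the whole input, while a tight mask around subjects leaves the non-sensitive background noise-free.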