Label-Only Model Inversion Attacks via Knowledge Transfer

In a model inversion (MI) attack, an adversary abuses access to a machine learning (ML) model to infer and reconstruct private training data. Remarkable progress has been made in the white-box and black-box setups, where the adversary has access to the complete model or to the model's soft output, respectively. However, very little work has studied the most challenging but practically important setup: label-only MI attacks, where the adversary has access only to the model's predicted label (hard label), without confidence scores or any other model information. In this work, we propose LOKT, a novel approach for label-only MI attacks. Our idea is based on transferring knowledge from the opaque target model to surrogate models. Using these surrogate models, our approach can then harness advanced white-box attacks. We propose knowledge transfer based on generative modelling, and introduce a new model, Target model-assisted ACGAN (T-ACGAN), for effective knowledge transfer. Our method casts the challenging label-only MI setup into the more tractable white-box setup. We provide analysis to support that surrogate models based on our approach serve as effective proxies for the target model for MI. Our experiments show that our method outperforms the existing state-of-the-art label-only MI attack by more than 15% across all MI benchmarks. Furthermore, our method compares favorably in terms of query budget. Our study highlights rising privacy threats for ML models even when minimal information (i.e., hard labels) is exposed. Our code, demo, models and reconstructed data are available at our project page: https://ngoc-nguyen-0.github.io/lokt/
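
The label-only setting and the idea of transferring knowledge to a surrogate can be illustrated with a toy sketch. It is not LOKT or T-ACGAN: the target here is a hypothetical hidden linear classifier, the queries are uniform random points rather than generator samples, and the surrogate is a plain softmax regression. All names (`query_hard_label`, `train_surrogate`, and so on) are assumptions made for illustration. The sketch only shows that a model trained purely on hard labels returned by the target can come to agree with it, after which the attacker holds a model whose parameters are fully visible.

```python
import math
import random

rng = random.Random(0)

# Hypothetical private target: a fixed linear classifier the attacker cannot inspect.
_TARGET_W = [(1.0, 0.0), (-0.5, 0.8), (-0.5, -0.8)]


def query_hard_label(x):
    """Label-only oracle: returns the predicted class index and nothing else."""
    scores = [w[0] * x[0] + w[1] * x[1] for w in _TARGET_W]
    return max(range(len(scores)), key=scores.__getitem__)


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def train_surrogate(samples, labels, n_classes=3, lr=0.5, epochs=30):
    """Fit a softmax-regression surrogate with SGD on (sample, hard label) pairs."""
    W = [[0.0, 0.0, 0.0] for _ in range(n_classes)]  # two weights plus a bias
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            feats = (x[0], x[1], 1.0)
            p = softmax([sum(w[j] * feats[j] for j in range(3)) for w in W])
            for k in range(n_classes):
                grad = p[k] - (1.0 if k == y else 0.0)  # cross-entropy gradient
                for j in range(3):
                    W[k][j] -= lr * grad * feats[j]
    return W


def surrogate_predict(W, x):
    feats = (x[0], x[1], 1.0)
    scores = [sum(w[j] * feats[j] for j in range(3)) for w in W]
    return max(range(len(scores)), key=scores.__getitem__)


def sample_points(n):
    # Stand-in for generated queries; the paper uses a generative model instead.
    return [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]


# The attacker spends 600 queries, each returning only a hard label.
train_x = sample_points(600)
train_y = [query_hard_label(x) for x in train_x]
surrogate = train_surrogate(train_x, train_y)

# Measure how often the surrogate reproduces the target's decisions.
test_x = sample_points(500)
agreement = sum(
    surrogate_predict(surrogate, x) == query_hard_label(x) for x in test_x
) / len(test_x)
print(f"surrogate/target agreement: {agreement:.2f}")
```

In this toy setup the query budget is simply the number of training points sent to `query_hard_label`; a white-box attack would then be run against `surrogate` instead of the target.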

