As a solution concept in cooperative game theory, the Shapley value is highly recognized in model interpretability studies and widely adopted by leading Machine Learning as a Service (MLaaS) providers such as Google, Microsoft, and IBM. However, although Shapley value-based model interpretability methods have been thoroughly studied, few researchers have considered the privacy risks incurred by Shapley values, even though interpretability and privacy are two foundations of machine learning (ML) models. In this paper, we investigate the privacy risks of Shapley value-based model interpretability methods via feature inference attacks: reconstructing the private model inputs from their Shapley value explanations. Specifically, we present two adversaries. The first reconstructs the private inputs by training an attack model on an auxiliary dataset, combined with black-box access to the model interpretability services. The second, even without any background knowledge, can successfully reconstruct most of the private features by exploiting the local linear correlations between the model inputs and outputs. We perform the proposed attacks on the leading MLaaS platforms, i.e., Google Cloud, Microsoft Azure, and IBM aix360. The experimental results demonstrate the vulnerability of the state-of-the-art Shapley value-based model interpretability methods deployed on these platforms and highlight the significance and necessity of designing privacy-preserving model interpretability methods in future studies. To the best of our knowledge, this is the first work to investigate the privacy risks of Shapley values.
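To illustrate the intuition behind the second adversary, consider a minimal sketch, assuming a model that is linear in a neighborhood of the input. For a linear model f(x) = w·x + b explained against a reference point r, the exact Shapley value of feature i is φᵢ = wᵢ(xᵢ − rᵢ), so an adversary who can locally estimate w and r can invert the explanation to recover the private input. The specific variable names and the synthetic data below are illustrative, not taken from the paper:

```python
import numpy as np

# Sketch of feature inference from Shapley explanations of a (locally) linear
# model. For f(x) = w @ x + b with reference r, the exact Shapley value of
# feature i is phi_i = w_i * (x_i - r_i), which the adversary can invert.
rng = np.random.default_rng(0)
w = 1.0 + rng.random(5)          # model weights (assumed locally estimable; nonzero)
r = rng.normal(size=5)           # reference/baseline input used by the explainer
x_private = rng.normal(size=5)   # the private input the adversary wants to recover

phi = w * (x_private - r)        # Shapley explanation returned by the service

# Inversion: exact whenever the model behaves linearly around x_private.
x_reconstructed = phi / w + r
assert np.allclose(x_reconstructed, x_private)
```

For nonlinear models the same inversion applies approximately in a local neighborhood, which is why the attack recovers most, rather than all, private features.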