The growing prominence of deep learning applications and their reliance on personalized data underscore the urgent need to address privacy vulnerabilities, particularly Membership Inference Attacks (MIAs). Despite numerous MIA studies, significant knowledge gaps persist: the impact of hidden features, taken in isolation, on attack efficacy is not well understood, and the root causes of successful attacks are insufficiently explained in terms of raw data features. In this paper, we address these gaps by first exploring statistical approaches to identify the most informative neurons and then quantifying how much the hidden activations of the selected neurons contribute to attack accuracy, both in isolation and in combination. Additionally, we propose an attack-driven explainable framework that integrates the target and attack models to identify the raw data features most influential in successful membership inference attacks. Our proposed MIA improves on the state-of-the-art MIA by up to 26%.
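The neuron-selection idea described in the abstract can be illustrated with a small sketch. The abstract does not state which statistical test is used, so the example below assumes Welch's t-statistic as a stand-in, uses synthetic activations in place of a real target model, and replaces the attack model with a simple threshold on a signed sum of the selected activations. All names (`welch_t`, `top_k_neurons`, `informative`) and all numeric settings are hypothetical and chosen only for illustration.

```python
import numpy as np

def welch_t(members, non_members):
    """Per-neuron Welch t-statistic between member and non-member activations."""
    mean_diff = members.mean(axis=0) - non_members.mean(axis=0)
    se = np.sqrt(members.var(axis=0, ddof=1) / len(members)
                 + non_members.var(axis=0, ddof=1) / len(non_members))
    return mean_diff / (se + 1e-12)

def top_k_neurons(members, non_members, k):
    """Indices of the k neurons whose activations differ most between groups."""
    t = welch_t(members, non_members)
    return np.argsort(-np.abs(t))[:k]

def draw(rng, n_samples, n_neurons, informative, shift):
    """Synthetic activations (assumption): members are shifted on a few neurons."""
    members = rng.normal(0.0, 1.0, (n_samples, n_neurons))
    non_members = rng.normal(0.0, 1.0, (n_samples, n_neurons))
    members[:, informative] += shift
    return members, non_members

rng = np.random.default_rng(0)
informative = [3, 17, 42]  # hypothetical "leaky" neurons
train_m, train_n = draw(rng, 500, 64, informative, shift=1.0)
test_m, test_n = draw(rng, 500, 64, informative, shift=1.0)

# Step 1: rank neurons by the statistic and keep the top k.
selected = top_k_neurons(train_m, train_n, k=3)
sign = np.sign(welch_t(train_m, train_n)[selected])

# Step 2: a toy attack that thresholds a signed sum of the selected activations.
def score(acts):
    return (acts[:, selected] * sign).sum(axis=1)

threshold = 0.5 * (score(train_m).mean() + score(train_n).mean())

# Step 3: evaluate on held-out synthetic samples.
correct = (score(test_m) > threshold).sum() + (score(test_n) <= threshold).sum()
accuracy = correct / (len(test_m) + len(test_n))
print("selected neurons:", sorted(selected.tolist()))
print("held-out attack accuracy:", round(float(accuracy), 3))
```

Changing `k`, or restricting `selected` to a single index, gives a rough way to compare neurons in isolation against neurons in combination, which is the kind of comparison the abstract describes. A real study would take the activations from a trained target model and would likely use a learned attack model rather than a fixed threshold.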