This paper explores inference-time data leakage risks of deep neural networks (NNs), where an honest-but-curious model service provider attempts to recover users' private input data solely from the model's inference results. In particular, we revisit residual NNs, both because of their popularity in computer vision and because of our hypothesis that residual blocks are a primary cause of data leakage owing to their skip connections. By formulating inference-time data leakage as a constrained optimization problem, we propose a novel backward feature inversion method, \textbf{PEEL}, which can effectively recover block-wise input features from the intermediate outputs of residual NNs. The surprisingly high-quality input recovery can be explained by the intuition that the output of a residual block can be viewed as a noisy version of its input and therefore retains sufficient information for input recovery. We demonstrate the effectiveness of our layer-by-layer feature inversion method on facial image datasets and pre-trained classifiers. Our results show that PEEL outperforms state-of-the-art recovery methods by an order of magnitude in mean squared error (MSE). The code is available at \href{https://github.com/Huzaifa-Arif/PEEL}{https://github.com/Huzaifa-Arif/PEEL}.
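As a minimal sketch of this intuition, consider a single residual block of the form $y = x + \mathcal{F}(x)$, where $x$ is the block input and $\mathcal{F}$ its residual branch (the notation here is illustrative; the full constrained formulation is specified in the paper). Because $y$ differs from $x$ only by the residual term $\mathcal{F}(x)$, block-wise recovery can be posed as
\[
    \hat{x} \in \arg\min_{x'} \bigl\| x' + \mathcal{F}(x') - y \bigr\|_2^2,
\]
and applying such an inversion block by block, from the intermediate output back toward the network input, yields a layer-by-layer reconstruction of the original input.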