This study investigates privacy leakage in dimensionality reduction methods through a novel machine learning-based reconstruction attack. Employing an \emph{informed adversary} threat model, we develop a neural network capable of reconstructing high-dimensional data from low-dimensional embeddings. We evaluate six popular dimensionality reduction techniques: PCA, sparse random projection (SRP), multidimensional scaling (MDS), Isomap, $t$-SNE, and UMAP. Using both the MNIST and NIH Chest X-ray datasets, we perform a qualitative analysis to identify key factors affecting reconstruction quality. Furthermore, we assess the effectiveness of an additive noise mechanism in mitigating these reconstruction attacks.
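To make the attack setting concrete, the following is a minimal sketch and not the method evaluated in this study. It assumes that the informed adversary knows every record except one target and also observes the released embeddings. It further swaps the neural network for a plain least-squares linear map, restricts itself to PCA, and uses synthetic low-rank data in place of MNIST or the chest X-ray images. All function names (`pca_fit`, `embed`, `fit_linear_attack`, `reconstruct`, `run_demo`) and the Gaussian form of the additive noise are illustrative assumptions.

```python
import numpy as np


def pca_fit(X, k):
    """Return the mean and the top-k principal directions of X (rows are samples)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k].T  # shapes (d,) and (d, k)


def embed(X, mean, components, noise_std=0.0, rng=None):
    """Project X to k dimensions; optionally add Gaussian noise (assumed defence)."""
    Z = (X - mean) @ components
    if noise_std > 0:
        rng = np.random.default_rng() if rng is None else rng
        Z = Z + rng.normal(0.0, noise_std, size=Z.shape)
    return Z


def fit_linear_attack(Z_known, X_known):
    """Least-squares map from embeddings (plus a bias column) back to inputs.

    This linear model is a stand-in for the neural network attack.
    """
    A = np.hstack([Z_known, np.ones((len(Z_known), 1))])
    W, *_ = np.linalg.lstsq(A, X_known, rcond=None)
    return W


def reconstruct(Z, W):
    """Apply the learned map to embeddings Z."""
    A = np.hstack([Z, np.ones((len(Z), 1))])
    return A @ W


def run_demo(noise_std, seed=0):
    """Return the mean squared reconstruction error on the single unknown target."""
    rng = np.random.default_rng(seed)
    # Synthetic data with rank-5 structure, standing in for image data.
    latent = rng.normal(size=(201, 5))
    mixing = rng.normal(size=(5, 64))
    X = latent @ mixing + 0.01 * rng.normal(size=(201, 64))

    mean, comps = pca_fit(X, k=5)
    Z = embed(X, mean, comps, noise_std, rng)

    # Assumed informed adversary: knows all records but the last one,
    # and sees the embeddings of every record, including the target.
    W = fit_linear_attack(Z[:-1], X[:-1])
    x_hat = reconstruct(Z[-1:], W)[0]
    return float(np.mean((x_hat - X[-1]) ** 2))


if __name__ == "__main__":
    print("MSE without noise:", run_demo(0.0))
    print("MSE with noise:   ", run_demo(5.0))
```

Under these assumptions, the target is recovered almost exactly when no noise is added, and the error grows once noise is added to the embedding. The nonlinear methods in the study (MDS, Isomap, $t$-SNE, UMAP) have no closed-form inverse of this kind, which is one reason a learned reconstruction network is the more general tool.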