This study develops a novel framework for privacy-preserving data analytics that addresses the challenge of balancing data utility against privacy. We introduce three algorithms: a Noise-Infusion Technique tailored to high-dimensional image data, a Variational Autoencoder (VAE) that extracts robust features while masking sensitive attributes, and an Expectation-Maximization (EM) approach optimized for structured data privacy. Applied to datasets such as Modified MNIST and CelebrityA, our methods significantly reduce the mutual information between sensitive attributes and the transformed data, thereby enhancing privacy. Our experimental results confirm that these approaches achieve superior privacy protection while retaining high utility, making them viable for practical applications where both are crucial. The research contributes a flexible and effective strategy for deploying privacy-preserving algorithms across data types and establishes new benchmarks for utility and confidentiality in data analytics.
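To make the privacy criterion concrete, the following is a minimal sketch of how added noise can lower the mutual information between a sensitive attribute and released data. It is a toy illustration under stated assumptions, not the Noise-Infusion Technique evaluated in this work: the data are synthetic (a binary attribute `s` leaking into a scalar feature `x`), the noise is plain additive Gaussian, and the helper names `mutual_information` and `infuse_noise` are hypothetical. Mutual information is estimated with a simple histogram plug-in estimator.

```python
import numpy as np


def mutual_information(s, z, bins=16):
    """Histogram plug-in estimate of I(S; Z) in nats.

    s: 1-D array of discrete sensitive labels.
    z: 1-D array of real-valued released data.
    """
    # Discretize z into `bins` equal-width cells (indices 0..bins-1).
    edges = np.histogram_bin_edges(z, bins=bins)
    z_binned = np.digitize(z, edges[1:-1])

    # Build the empirical joint distribution of (s, binned z).
    s_values, s_idx = np.unique(s, return_inverse=True)
    joint = np.zeros((len(s_values), bins))
    np.add.at(joint, (s_idx, z_binned), 1.0)
    joint /= joint.sum()

    # Marginals and the product distribution they induce.
    p_s = joint.sum(axis=1, keepdims=True)
    p_z = joint.sum(axis=0, keepdims=True)
    product = p_s @ p_z

    # Sum p(s, z) * log(p(s, z) / (p(s) p(z))) over non-empty cells.
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / product[mask])))


def infuse_noise(x, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma (assumed noise model)."""
    return x + rng.normal(0.0, sigma, size=x.shape)


# Synthetic data: the feature x is the sensitive bit s plus small measurement noise,
# so x leaks s almost completely before any privacy transformation.
rng = np.random.default_rng(0)
n = 20_000
s = rng.integers(0, 2, size=n)
x = s + rng.normal(0.0, 0.3, size=n)

# Estimate the leakage for increasing noise levels.
mi_by_sigma = {
    sigma: mutual_information(s, infuse_noise(x, sigma, rng))
    for sigma in (0.0, 1.0, 3.0)
}

for sigma, mi in mi_by_sigma.items():
    print(f"sigma={sigma:.1f}  estimated I(S; Z) = {mi:.4f} nats")
```

In this toy setting the estimated mutual information falls as `sigma` grows, but so does the fidelity of the released feature, which is the utility and privacy trade-off the framework is designed to manage. The plug-in estimator is biased for small samples or many bins, so it should be read as a rough diagnostic rather than a precise measurement.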