Privacy preservation has become a critical concern in high-dimensional data analysis as data-driven applications grow more prevalent. Sliced inverse regression, proposed by Li (1991), is a widely used statistical technique for reducing the dimension of the covariates while retaining sufficient statistical information. In this paper, we propose optimal differentially private algorithms that address privacy concerns in sufficient dimension reduction. We first establish minimax lower bounds for differentially private sliced inverse regression in both the low- and high-dimensional settings. We then develop differentially private algorithms that achieve these lower bounds up to logarithmic factors. Through simulations and real data analysis, we illustrate that these algorithms safeguard privacy while preserving vital information in the reduced-dimension space. As a natural extension, we obtain analogous lower and upper bounds for differentially private sparse principal component analysis, which may also be of interest to the statistical and machine learning community.
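For readers unfamiliar with the method, the following is a minimal sketch of classical sliced inverse regression in the sense of Li (1991), written with NumPy. It is not the algorithm proposed in this paper. The function name `sir_directions` and its parameters are hypothetical, and the optional `noise_scale` argument only illustrates the general idea of perturbing the kernel matrix with symmetric Gaussian noise; it is an assumption made here for illustration and is not calibrated to any differential privacy guarantee.

```python
import numpy as np

def sir_directions(X, y, n_slices=10, n_directions=1, noise_scale=0.0, rng=None):
    """Estimate dimension-reduction directions with classical SIR.

    Assumes X has shape (n, p) with p >= 2 and a nonsingular sample covariance.
    noise_scale is a hypothetical knob: it adds symmetric Gaussian noise to the
    kernel matrix and does NOT by itself provide differential privacy.
    """
    rng = np.random.default_rng(rng)
    n, p = X.shape

    # Standardize the covariates: Z = (X - mean) Sigma^{-1/2}.
    mu = X.mean(axis=0)
    sigma = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(sigma)
    sigma_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mu) @ sigma_inv_sqrt

    # Slice the observations by the order of the response.
    order = np.argsort(y)
    slices = np.array_split(order, n_slices)

    # Kernel matrix: weighted covariance of the within-slice means of Z.
    M = np.zeros((p, p))
    for idx in slices:
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)

    # Optional illustrative perturbation (uncalibrated, see docstring).
    if noise_scale > 0:
        E = rng.normal(scale=noise_scale, size=(p, p))
        M = M + np.triu(E) + np.triu(E, 1).T  # symmetrized noise

    # Leading eigenvectors of M, mapped back to the original covariate scale.
    _, vecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    top = vecs[:, ::-1][:, :n_directions]
    B = sigma_inv_sqrt @ top
    return B / np.linalg.norm(B, axis=0)
```

As a usage example, with `X` drawn from a standard normal distribution and `y` depending on `X` only through a single linear combination, `sir_directions(X, y)` returns a unit-norm `(p, 1)` array that should align with that combination up to sign. The number of slices trades off bias and variance of the slice means; ten is a common default rather than a recommendation from the paper.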