A recently proposed scheme that combines local noise addition with matrix masking enables data collection while protecting individual privacy from all parties, including the central data manager. Statistical analysis of such privacy-preserved data is particularly challenging for nonlinear models such as logistic regression. By leveraging a relationship between logistic regression and linear regression estimators, we propose the first valid statistical analysis method for logistic regression in this setting. Theoretical analysis confirms the validity of the proposed estimators under an asymptotic framework in which the noise magnitude increases, reflecting strict privacy requirements. Simulations and real data analyses demonstrate that the proposed estimators outperform naive logistic regression methods on privacy-preserved data sets.
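To illustrate why linear regression estimators are a natural starting point for such data, the following minimal sketch shows a generic property of orthogonal masking. It is not the estimator proposed here, and the abstract does not specify the exact scheme. The sketch assumes Gaussian noise with known standard deviation `sigma`, a random orthogonal left mask `A`, and simulated data. All variable names are hypothetical. Left-multiplying by an orthogonal matrix leaves the Gram matrix unchanged, so second-moment quantities, and hence least-squares-type estimators, can still be computed from the masked data once the known noise variance is subtracted.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 1000, 3
sigma = 1.0  # assumed known noise standard deviation (illustrative choice)

# Simulated covariates and a continuous response (illustration only).
X = rng.normal(size=(n, p))
beta = np.array([1.0, -0.5, 0.25])
y = X @ beta + rng.normal(size=n)

# Local noise addition: each record is perturbed before leaving its owner.
Z = np.column_stack([X, y])
Z_noisy = Z + sigma * rng.normal(size=Z.shape)

# Matrix masking: a random orthogonal matrix obtained from a QR decomposition.
A, _ = np.linalg.qr(rng.normal(size=(n, n)))
Z_masked = A @ Z_noisy

# Orthogonal masking preserves the Gram matrix: (A Z)^T (A Z) = Z^T Z.
G_masked = Z_masked.T @ Z_masked
G_noisy = Z_noisy.T @ Z_noisy

# Subtract the expected noise contribution n * sigma^2 * I, then solve the
# normal equations using the covariate block and the covariate-response block.
G_corr = G_masked - n * sigma**2 * np.eye(p + 1)
beta_hat = np.linalg.solve(G_corr[:p, :p], G_corr[:p, p])
print(beta_hat)
```

Under these assumptions, `beta_hat` is a noise-corrected least-squares estimate computed without access to the unmasked records. Its variability grows with `sigma`, which is one reason an asymptotic framework with increasing noise magnitude is relevant.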