Spatial autoregressive (SAR) models are important tools for studying network effects. With the growing emphasis on data privacy, however, data providers often apply privacy-protection measures that make classical SAR models inapplicable. In this study, we introduce a privacy-protected SAR model in which noise is added to both the responses and the covariates to meet privacy-protection requirements. In this setting, the traditional quasi-maximum likelihood estimator is infeasible because the likelihood function cannot be formulated. To address this issue, we first consider an explicit expression for the likelihood function when only the responses are noise-added. Because the covariates are also noise-added, the derivatives of this likelihood are biased, so we develop techniques that correct the bias introduced by the noise. A Newton-Raphson-type algorithm is then proposed to compute the resulting corrected likelihood estimator. To improve computational efficiency, we further introduce a corrected least squares estimator based on the same bias-correction idea. Both estimation methods preserve data security while yielding statistically valid estimators. We analyze both estimators theoretically and discuss statistical inference methods. The finite-sample performance of the methods is demonstrated through extensive simulations and the analysis of a real dataset.
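The bias-correction idea behind a corrected least squares estimator can be illustrated with a minimal sketch. This is not the estimator proposed in the study: it drops the spatial term entirely and uses an ordinary linear model, and it assumes that independent Gaussian noise with a known variance is added to every covariate and to the response. The function name `corrected_least_squares` and all simulation settings are hypothetical. Under these assumptions, the Gram matrix of the noisy covariates is inflated by roughly n times the noise variance, so subtracting that term removes the attenuation bias of the naive fit.

```python
import numpy as np


def corrected_least_squares(X_noisy, y_noisy, noise_var):
    """Least squares with a correction for additive covariate noise.

    Assumes (for illustration only) that X_noisy = X + U, where the entries
    of U are independent, mean zero, with known variance noise_var.
    """
    n, p = X_noisy.shape
    # E[X_noisy' X_noisy] = X'X + n * noise_var * I, so subtract the excess.
    gram = X_noisy.T @ X_noisy - n * noise_var * np.eye(p)
    return np.linalg.solve(gram, X_noisy.T @ y_noisy)


# Simulated data (all settings are arbitrary choices for this sketch).
rng = np.random.default_rng(0)
n, p = 20000, 3
beta_true = np.array([1.0, -2.0, 0.5])
sigma_u = 0.8  # std of the noise added to covariates (assumed known)
sigma_v = 0.5  # std of the noise added to the response

X = rng.normal(size=(n, p))
y = X @ beta_true + rng.normal(scale=0.3, size=n)

# What the data provider releases: noise-added covariates and response.
X_noisy = X + rng.normal(scale=sigma_u, size=(n, p))
y_noisy = y + rng.normal(scale=sigma_v, size=n)

# Naive fit ignores the covariate noise and is shrunk toward zero.
beta_naive = np.linalg.lstsq(X_noisy, y_noisy, rcond=None)[0]
beta_corrected = corrected_least_squares(X_noisy, y_noisy, sigma_u**2)

print("true     :", beta_true)
print("naive    :", beta_naive)
print("corrected:", beta_corrected)
```

In this simplified setting, mean-zero noise on the response only inflates the variance of the estimate, whereas noise on the covariates biases the naive fit, which is why the correction targets the covariate Gram matrix. A SAR model would additionally have to handle the spatially lagged response, which this sketch does not attempt.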