Pairwise likelihood is a useful approximation to the full likelihood function for covariance estimation in high-dimensional settings. It simplifies high-dimensional dependencies by combining marginal bivariate likelihood objects, making estimation more manageable. In certain models, including the Gaussian model, the pairwise and full likelihoods are maximized by the same parameter values, so the pairwise estimator retains optimal statistical efficiency when the number of variables is fixed. Leveraging this insight, we propose estimating sparse high-dimensional covariance matrices by maximizing a truncated version of the pairwise likelihood function, obtained by including only the pairwise terms that correspond to nonzero covariance elements. To achieve a meaningful truncation, we propose to minimize the $L_2$-distance between the pairwise and full likelihood scores plus an $L_1$-penalty that discourages the inclusion of uninformative terms. Unlike other regularization approaches, our method selects whole pairwise likelihood objects rather than shrinking individual covariance parameters, and thus retains the inherent unbiasedness of the pairwise likelihood estimating equations. This selection procedure is shown to be selection consistent even as the covariance dimension grows exponentially fast. Consequently, the implied pairwise likelihood estimator is consistent and converges to the oracle maximum likelihood estimator, which assumes knowledge of the nonzero covariance entries.
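To make the notion of a truncated pairwise likelihood concrete, the following is a minimal sketch for the zero-mean Gaussian case. It is an illustration under stated assumptions, not the procedure of the paper: the function names and the 0/1 matrix `weights` are hypothetical, the set of included pairs is taken as given, and the $L_1$-penalized selection step is not implemented. It also illustrates the Gaussian property mentioned above: each bivariate term is maximized by the corresponding 2x2 block of the sample covariance, so perturbing an included entry cannot increase the objective.

```python
import itertools
import numpy as np


def bivariate_gaussian_loglik(x_pair, sigma_pair):
    """Zero-mean bivariate Gaussian log-likelihood, summed over the rows of x_pair (n x 2)."""
    n = x_pair.shape[0]
    sign, logdet = np.linalg.slogdet(sigma_pair)
    if sign <= 0:
        # Not positive definite: treat as an inadmissible parameter value.
        return -np.inf
    # Sum over observations of x_i^T Sigma^{-1} x_i.
    quad = np.einsum("ij,jk,ik->", x_pair, np.linalg.inv(sigma_pair), x_pair)
    return -0.5 * (n * (2.0 * np.log(2.0 * np.pi) + logdet) + quad)


def truncated_pairwise_loglik(x, sigma, weights):
    """Sum of bivariate log-likelihoods over pairs (j, k), j < k, with weights[j, k] nonzero.

    `weights` is a hypothetical 0/1 indicator of the included pairs (assumed known here).
    """
    p = x.shape[1]
    total = 0.0
    for j, k in itertools.combinations(range(p), 2):
        if weights[j, k]:
            idx = [j, k]
            total += bivariate_gaussian_loglik(x[:, idx], sigma[np.ix_(idx, idx)])
    return total


# Illustrative data: a sparse 4 x 4 covariance with two nonzero off-diagonal pairs.
rng = np.random.default_rng(0)
sigma_true = np.eye(4)
sigma_true[0, 1] = sigma_true[1, 0] = 0.5
sigma_true[2, 3] = sigma_true[3, 2] = 0.3
x = rng.multivariate_normal(np.zeros(4), sigma_true, size=500)

# Zero-mean sample covariance (divisor n), i.e. the unrestricted Gaussian MLE.
s = x.T @ x / x.shape[0]
# Include only the pairs whose true covariance is nonzero (oracle knowledge, for illustration).
weights = (sigma_true != 0) & ~np.eye(4, dtype=bool)

loglik_at_s = truncated_pairwise_loglik(x, s, weights)
perturbed = s.copy()
perturbed[0, 1] += 0.1
perturbed[1, 0] += 0.1
loglik_perturbed = truncated_pairwise_loglik(x, perturbed, weights)
print(loglik_at_s >= loglik_perturbed)
```

In this sketch the truncation only decides which bivariate terms enter the sum; how the excluded entries of the covariance estimate are handled (for example, set to zero) is left open here.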