Gaussian Graphical Models (GGMs) are widely used to infer conditional dependence structures in high-dimensional data. However, standard precision matrix estimators are highly sensitive to data contamination, such as extreme outliers and heavy-tailed noise. In this paper, we propose DROP (Distributionally Robust Optimization), a robust estimation method formulated within a multi-task nodewise regression framework. The proposed estimator enforces structural sparsity while resisting the influence of corrupted observations. Theoretically, we establish error bounds for the DROP estimator under general contamination. Through extensive high-dimensional simulations, we demonstrate that DROP consistently controls the rate of false positive edges and outperforms conventional non-robust estimators when data deviate from standard Gaussian assumptions. Furthermore, in a functional MRI (fMRI) application, DROP maintains a stable graph structure and preserves network modularity even when subjected to severe data perturbations, whereas competing methods yield excessively dense networks. To facilitate reproducible research, the DROP R package will be made publicly available on GitHub.