In this paper, we investigate the Gaussian graphical model inference problem in a novel setting that we call erose measurements, referring to irregularly measured or observed data. For graphs, this results in different node pairs having vastly different sample sizes, a situation that frequently arises in data integration, genomics, neuroscience, and sensor networks. Existing work characterizes graph selection performance using the minimum pairwise sample size, which provides little insight for erosely measured data, and no existing inference method is applicable. We aim to fill this gap by proposing GI-JOE (Graph Inference when Joint Observations are Erose), the first inference method that characterizes the different uncertainty levels over the graph caused by erose measurements. Specifically, we develop an edge-wise inference method and an affiliated false discovery rate (FDR) control procedure, where the variance of each edge depends on the sample sizes associated with the corresponding neighbors. We prove statistical validity under erose measurements through careful localized edge-wise analysis and by disentangling the dependencies across the graph. Finally, through simulation studies and a real neuroscience data example, we demonstrate the advantages of our inference methods for graph selection from erosely measured data.
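To make the notion of pairwise sample size concrete, the following minimal sketch counts, for each node pair, the number of samples in which both nodes were observed. It is only an illustration under assumed inputs: the function name `pairwise_sample_sizes` and the toy observation mask are hypothetical and are not part of GI-JOE.

```python
def pairwise_sample_sizes(observed):
    """Count joint observations for every pair of variables.

    observed[i][j] is True when variable j was measured in sample i
    (a hypothetical observation mask, used here for illustration only).
    Entry [j][k] of the result is the number of samples in which
    variables j and k were both measured.
    """
    p = len(observed[0])
    counts = [[0] * p for _ in range(p)]
    for row in observed:
        seen = [j for j, is_seen in enumerate(row) if is_seen]
        for j in seen:
            for k in seen:
                counts[j][k] += 1
    return counts


# Toy mask: 4 samples, 3 variables, irregularly observed.
observed = [
    [True, True, False],
    [True, True, False],
    [True, False, True],
    [True, True, True],
]
counts = pairwise_sample_sizes(observed)

# Off-diagonal entries are the pairwise sample sizes.
off_diagonal = [counts[j][k] for j in range(3) for k in range(3) if j != k]
print(min(off_diagonal), max(off_diagonal))
```

In this toy mask the pairwise sample sizes range from 1 to 3, so a summary based only on the minimum would describe every pair as if it had a single joint observation.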