A discrete spatial lattice can be cast as a network structure over which spatially correlated outcomes are observed. When such information is available, a second network structure can capture similarities among the measured features. Incorporating both network structures when analyzing such doubly-structured data can improve predictive power and lead to better identification of important features in the data-generating process. Motivated by applications in spatial disease mapping, we develop a new doubly regularized regression framework that incorporates these network structures for analyzing high-dimensional datasets. Our estimators can easily be implemented with standard convex optimization algorithms. In addition, we describe a procedure to obtain asymptotically valid confidence intervals and hypothesis tests for our model parameters. We show empirically that our framework provides improved predictive accuracy and inferential power compared to existing high-dimensional spatial methods. These advantages hold when the network information is fully accurate, and also when the networks are partially misspecified or uninformative. Applying the proposed method to COVID-19 mortality data suggests that it can improve the prediction of deaths over standard spatial models, and that it selects relevant covariates more often.
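To make the idea of a doubly regularized regression concrete, the following is a minimal sketch and not the authors' estimator, whose exact form the abstract does not give. It assumes a model y = Xβ + γ + ε with a location-specific effect γ, a graph-Laplacian smoothness penalty on γ over the spatial network, a second Laplacian penalty on β over the feature network, and an ℓ1 penalty on β for sparsity. The path-graph networks, the simulated data, the penalty weights, and all function names are illustrative assumptions; the solver is plain proximal gradient descent, one standard convex optimization algorithm.

```python
import numpy as np

def path_laplacian(m):
    """Laplacian of a path graph on m nodes (stand-in for a lattice or feature network)."""
    A = np.zeros((m, m))
    idx = np.arange(m - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A

def soft_threshold(v, t):
    """Proximal operator of t * ||v||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(y, X, beta, gamma, L_feat, L_space, lam1, lam_f, lam_s):
    """Assumed objective: squared loss + l1 + two Laplacian quadratic penalties."""
    r = y - X @ beta - gamma
    return (0.5 * r @ r
            + lam1 * np.abs(beta).sum()
            + 0.5 * lam_f * beta @ L_feat @ beta
            + 0.5 * lam_s * gamma @ L_space @ gamma)

def fit(y, X, L_feat, L_space, lam1=1.0, lam_f=1.0, lam_s=1.0, n_iter=500):
    """Proximal gradient descent on the assumed doubly regularized objective."""
    n, p = X.shape
    beta, gamma = np.zeros(p), np.zeros(n)
    # Hessian of the smooth part; its largest eigenvalue sets the step size.
    Z = np.hstack([X, np.eye(n)])
    H = Z.T @ Z
    H[:p, :p] += lam_f * L_feat
    H[p:, p:] += lam_s * L_space
    step = 1.0 / np.linalg.eigvalsh(H)[-1]
    history = []
    for _ in range(n_iter):
        r = X @ beta + gamma - y
        g_beta = X.T @ r + lam_f * (L_feat @ beta)
        g_gamma = r + lam_s * (L_space @ gamma)
        # l1 is applied to beta only, so gamma takes a plain gradient step.
        beta = soft_threshold(beta - step * g_beta, step * lam1)
        gamma = gamma - step * g_gamma
        history.append(objective(y, X, beta, gamma, L_feat, L_space,
                                 lam1, lam_f, lam_s))
    return beta, gamma, history

# Simulated data (purely illustrative): 30 locations on a line, 10 features.
rng = np.random.default_rng(0)
n, p = 30, 10
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, 1.0, 1.0] + [0.0] * (p - 3))
gamma_true = np.sin(np.linspace(0.0, np.pi, n))   # spatially smooth effect
y = X @ beta_true + gamma_true + 0.1 * rng.normal(size=n)

beta_hat, gamma_hat, history = fit(y, X, path_laplacian(p), path_laplacian(n))
```

With a step size of one over the largest Hessian eigenvalue, the objective values stored in `history` should be non-increasing, which gives a quick sanity check on the solver. In practice the three penalty weights would be tuned, for example by cross-validation.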