An important problem in network analysis is predicting a node attribute using both network covariates, such as graph embedding coordinates or local subgraph counts, and conventional node covariates, such as demographic characteristics. While standard regression methods that use both types of covariates can serve for prediction, statistical inference is complicated by the fact that the nodal summary statistics are often dependent in complex ways. We show that, under a mild joint exchangeability assumption, a network analog of conformal prediction achieves finite-sample validity for a wide range of network covariates. We also show that a form of asymptotic conditional validity is achievable. The methods are illustrated on both simulated networks and a citation network dataset.
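To make the idea of conformal prediction with network covariates concrete, the following is a minimal sketch of generic split conformal prediction on a simulated network. It is not the authors' procedure: the Erdős–Rényi graph, the use of node degree as the network covariate, the synthetic `age` covariate, the linear response model, and the helper name `split_conformal_interval` are all assumptions made for illustration only.

```python
import numpy as np


def split_conformal_interval(x_train, y_train, x_calib, y_calib, x_test, alpha=0.1):
    """Generic split conformal intervals around a least-squares fit (illustrative)."""

    def design(x):
        # Add an intercept column to the covariate matrix.
        return np.column_stack([np.ones(len(x)), x])

    # Fit ordinary least squares on the training nodes only.
    beta, *_ = np.linalg.lstsq(design(x_train), y_train, rcond=None)

    # Conformity scores: absolute residuals on the held-out calibration nodes.
    scores = np.sort(np.abs(y_calib - design(x_calib) @ beta))
    n = len(scores)

    # Use the ceil((n + 1) * (1 - alpha))-th smallest score as the half-width.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    q = scores[k - 1] if k <= n else np.inf

    pred = design(x_test) @ beta
    return pred - q, pred + q


rng = np.random.default_rng(0)
n_nodes = 300

# Hypothetical network: undirected Erdős–Rényi graph with edge probability 0.05.
upper = np.triu((rng.random((n_nodes, n_nodes)) < 0.05).astype(int), k=1)
adjacency = upper + upper.T

# Network covariate (degree) and a conventional node covariate (synthetic "age").
degree = adjacency.sum(axis=1).astype(float)
age = rng.normal(size=n_nodes)
covariates = np.column_stack([degree, age])

# Assumed response model, for illustration only.
response = 0.5 * degree + age + rng.normal(size=n_nodes)

# Random split of nodes into training, calibration, and test sets.
perm = rng.permutation(n_nodes)
train, calib, test = perm[:100], perm[100:200], perm[200:]

lower, upper_bound = split_conformal_interval(
    covariates[train], response[train],
    covariates[calib], response[calib],
    covariates[test], alpha=0.1,
)

# Empirical coverage on the test nodes; the nominal target here is 0.9.
coverage = np.mean((response[test] >= lower) & (response[test] <= upper_bound))
print(f"empirical coverage: {coverage:.2f}")
```

Because node degree is computed from the whole graph, the covariates of different nodes are dependent, which is the setting in which the abstract's joint exchangeability assumption is invoked to justify validity. The sketch only demonstrates the mechanics and does not verify that assumption.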