We study the properties of conformal prediction for network data under various sampling mechanisms that commonly arise in practice but often result in a non-representative sample of nodes. We interpret these sampling mechanisms as selection rules applied to a superpopulation and study the validity of conformal prediction conditional on an appropriate selection event. We show that the sampled subarray is exchangeable conditional on the selection event if the selection rule satisfies a permutation invariance property and a joint exchangeability condition holds for the superpopulation. Our result implies the finite-sample validity of conformal prediction for certain selection events related to ego networks and snowball sampling. We also show that when data are sampled via a random walk on a graph, a variant of weighted conformal prediction yields asymptotically valid prediction sets for an independently selected node from the population.
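The random-walk setting can be illustrated with a minimal sketch. It is one plausible reading of the idea, not necessarily the construction used in this work. It assumes a simple random walk on a connected undirected graph, whose stationary distribution is proportional to node degree, so calibration scores are reweighted by inverse degree to target a uniformly selected node. The function names, the inverse-degree weighting, and the absolute-residual score are all assumptions made for illustration.

```python
import numpy as np


def inverse_degree_weights(degrees):
    """Hypothetical weighting: 1 / degree, to offset a degree-proportional sampling bias."""
    return 1.0 / np.asarray(degrees, dtype=float)


def weighted_conformal_quantile(scores, weights, test_weight, alpha):
    """Weighted (1 - alpha) quantile of calibration scores.

    The test point contributes mass `test_weight` at +infinity, as in
    standard weighted conformal prediction.
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(scores)
    sorted_scores = scores[order]
    total = weights.sum() + test_weight
    # Cumulative normalized weight of the sorted calibration scores.
    cum = np.cumsum(weights[order]) / total
    idx = np.searchsorted(cum, 1.0 - alpha, side="left")
    if idx >= len(sorted_scores):
        # Not enough calibration mass: the prediction set is unbounded.
        return np.inf
    return sorted_scores[idx]


def prediction_interval(point_prediction, scores, degrees, test_degree, alpha):
    """Interval for absolute-residual scores: prediction +/- weighted quantile."""
    w = inverse_degree_weights(degrees)
    w_test = 1.0 / float(test_degree)
    q = weighted_conformal_quantile(scores, w, w_test, alpha)
    return point_prediction - q, point_prediction + q
```

With equal weights the quantile reduces to the usual split conformal rank, ceil((n + 1)(1 - alpha)), which gives a quick sanity check on the weighted version.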