Split conformal prediction has recently sparked great interest due to its ability to provide formally guaranteed uncertainty sets or intervals for predictions made by black-box neural models, ensuring a predefined probability of containing the actual ground truth. While the original formulation assumes data exchangeability, some extensions handle non-exchangeable data, which are common in real-world scenarios. In parallel, some progress has been made in conformal methods that provide statistical guarantees for a broader range of objectives, such as bounding the best $F_1$-score or minimizing the false negative rate in expectation. In this paper, we leverage and extend these two lines of work by proposing non-exchangeable conformal risk control, which allows controlling the expected value of any monotone loss function when the data is not exchangeable. Our framework is flexible, makes very few assumptions, and allows weighting the data based on its relevance for a given test example; a careful choice of weights may result in tighter bounds, making our framework useful in the presence of change points, time series, or other forms of distribution drift. Experiments with both synthetic and real-world data show the usefulness of our method.
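To make the idea of weighting calibration data concrete, the following is a minimal sketch of how a weighted risk-control threshold could be selected. It is an illustration under stated assumptions, not necessarily the exact procedure of the paper: we assume each calibration loss is non-increasing in a threshold $\lambda$ and bounded above by a known constant, that weights lie in $[0, 1]$, and that the unseen test point is charged the worst-case loss with weight 1. The names `weighted_risk_threshold` and `miscoverage_loss`, the candidate grid, and the 0/1 change-point weights are all hypothetical choices made for this example.

```python
from typing import Callable, Sequence


def miscoverage_loss(score: float) -> Callable[[float], float]:
    """Hypothetical monotone loss: 1 if the true label's score exceeds lambda."""
    return lambda lam: 1.0 if score > lam else 0.0


def weighted_risk_threshold(
    losses: Sequence[Callable[[float], float]],
    weights: Sequence[float],
    lambdas: Sequence[float],
    alpha: float,
    loss_upper_bound: float = 1.0,
) -> float:
    """Return the smallest candidate lambda whose weighted risk bound is <= alpha.

    Assumes every loss is non-increasing in lambda and bounded above by
    loss_upper_bound, and that all weights lie in [0, 1].
    """
    if len(losses) != len(weights):
        raise ValueError("need one weight per calibration loss")
    # The extra 1.0 is the (assumed) weight given to the unseen test point.
    total = sum(weights) + 1.0
    for lam in sorted(lambdas):
        weighted = sum(w * loss(lam) for w, loss in zip(weights, losses))
        # The test loss is unknown, so it is charged the worst case.
        if (weighted + loss_upper_bound) / total <= alpha:
            return lam
    # No candidate meets the bound: fall back to the most conservative one.
    return max(lambdas)


# Toy change point: the first five calibration scores come from an old regime.
scores = [10.0, 10.0, 10.0, 10.0, 10.0, 1.0, 2.0, 3.0, 4.0, 5.0]
losses = [miscoverage_loss(s) for s in scores]
grid = [float(k) for k in range(11)]

uniform = weighted_risk_threshold(losses, [1.0] * 10, grid, alpha=0.4)
recent = weighted_risk_threshold(losses, [0.0] * 5 + [1.0] * 5, grid, alpha=0.4)
print(uniform, recent)
```

In this toy setting, down-weighting the pre-change-point examples yields a smaller threshold than uniform weights, which is the sense in which a careful choice of weights can give tighter sets. How to pick weights in practice (for example, by recency or by similarity to the test input) is left open here.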