Uncertainty quantification is crucial in safety-critical systems, where decisions must be made under uncertainty. In particular, we consider the problem of online uncertainty quantification, where data points arrive sequentially. Online conformal prediction is a principled online uncertainty quantification method that dynamically constructs a prediction set at each time step. While existing methods for online conformal prediction provide long-run coverage guarantees without any distributional assumptions, they typically assume a full feedback setting in which the true label is always observed. In this paper, we propose a novel learning method for online conformal prediction with partial feedback from an adaptive adversary, a more challenging setup in which the true label is revealed only when it lies inside the constructed prediction set. Specifically, we formulate online conformal prediction as an adversarial bandit problem by treating each candidate prediction set as an arm. Building on an existing algorithm for adversarial bandits, our method achieves a long-run coverage guarantee by explicitly establishing its connection to the regret of the learner. Finally, we empirically demonstrate the effectiveness of our method in both independent and identically distributed (i.i.d.) and non-i.i.d. settings, showing that it successfully controls the miscoverage rate while maintaining a reasonable size of the prediction set.
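To make the bandit formulation concrete, the following is a minimal illustrative sketch and not the method proposed in this paper. It rests on several assumptions that the abstract does not state: the base algorithm is taken to be EXP3 with uniform exploration, each arm is a threshold on a nonconformity score (so the candidate prediction sets are nested), and the loss is a hypothetical weighted sum of a miscoverage indicator and the normalized set size. The names `exp3_conformal`, `thresholds`, `eta`, `gamma`, and `miss_weight` are all introduced here for illustration. The sketch makes no claim about coverage guarantees.

```python
import math
import random


def exp3_conformal(stream, thresholds, num_labels,
                   eta=0.05, gamma=0.05, miss_weight=0.9, seed=0):
    """EXP3 over candidate thresholds; each threshold is one arm.

    `stream` yields (scores, true_label), where scores[y] is a nonconformity
    score for label y (lower means more plausible). The true label is used
    only to simulate partial feedback: the learner learns nothing beyond
    whether the label fell inside the prediction set it chose.
    Returns the empirical miscoverage rate and the last sampling distribution.
    """
    rng = random.Random(seed)
    k = len(thresholds)
    log_w = [0.0] * k            # log-weights, for numerical stability
    probs = [1.0 / k] * k
    misses, rounds = 0, 0

    for scores, true_label in stream:
        # Softmax of the log-weights, mixed with uniform exploration.
        m = max(log_w)
        w = [math.exp(v - m) for v in log_w]
        total = sum(w)
        probs = [(1.0 - gamma) * v / total + gamma / k for v in w]

        # Sample an arm and build its prediction set.
        arm = rng.choices(range(k), weights=probs)[0]
        pred_set = {y for y in range(num_labels)
                    if scores[y] <= thresholds[arm]}

        # Partial feedback: the label is revealed only if it is in the set.
        covered = true_label in pred_set

        # Hypothetical loss in [0, 1]: penalize misses and large sets.
        loss = (miss_weight * (0.0 if covered else 1.0)
                + (1.0 - miss_weight) * len(pred_set) / num_labels)

        # Importance-weighted update of the chosen arm only.
        log_w[arm] -= eta * loss / probs[arm]

        misses += 0 if covered else 1
        rounds += 1

    return misses / max(rounds, 1), probs


if __name__ == "__main__":
    # Synthetic i.i.d. stream: the true label tends to receive a low score.
    gen = random.Random(1)
    data = []
    for _ in range(2000):
        y = gen.randrange(5)
        s = [gen.random() for _ in range(5)]
        s[y] *= 0.3
        data.append((s, y))
    rate, final_probs = exp3_conformal(data, [0.1, 0.2, 0.3, 0.5, 1.0], 5)
    print("miscoverage rate:", rate)
    print("final arm probabilities:", final_probs)
```

Only the loss of the chosen arm is used in the update, which is what makes this a bandit rather than a full-information learner; the loss is always computable because the learner knows both the size of its own set and whether the label was revealed.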