Conformal novelty detection is a classical machine learning task in which uncertainty quantification is essential for reliable results. Recent work has shown that the Benjamini-Hochberg (BH) procedure applied to conformal p-values controls the false discovery rate (FDR). However, as pointed out by Soloff et al. (2024), the BH procedure can give over-optimistic assessments near the rejection threshold, where false discoveries concentrate at the margin. Soloff et al. address this issue with the support line (SL) correction, which is proven to control the boundary false discovery rate (bFDR) in the independent, non-conformal setting. The present work extends the SL method to the conformal setting. First, we show that the SL procedure can violate bFDR control in this setting. Second, we propose several alternatives that provably control the bFDR in the conformal setting. Finally, numerical experiments on both synthetic and real data support our theoretical findings and show the relevance of the newly proposed procedures.
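For readers unfamiliar with the baseline that the abstract starts from, the following is a minimal sketch of BH applied to conformal p-values. It is not the SL correction or any of the procedures proposed in this work. The function names `conformal_p_values` and `benjamini_hochberg` are hypothetical, and the sketch assumes that a larger nonconformity score means "more novel" and that calibration scores come from a held-out sample of nulls.

```python
import numpy as np


def conformal_p_values(cal_scores, test_scores):
    """Conformal p-values p_j = (1 + #{i : s_i >= t_j}) / (n + 1).

    Assumption: higher score = more evidence of novelty.
    """
    cal = np.sort(np.asarray(cal_scores, dtype=float))
    test = np.asarray(test_scores, dtype=float)
    n = cal.size
    # searchsorted(side="left") counts calibration scores strictly below
    # each test score, so n minus that count is the number of scores >= it.
    n_ge = n - np.searchsorted(cal, test, side="left")
    return (1.0 + n_ge) / (n + 1.0)


def benjamini_hochberg(p_values, alpha):
    """Boolean mask of BH rejections at nominal level alpha (step-up rule)."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    sorted_p = p[order]
    # Compare the k-th smallest p-value with alpha * k / m.
    below = sorted_p <= alpha * np.arange(1, m + 1) / m
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        # Step-up: reject everything up to the largest k passing the test.
        k = np.nonzero(below)[0].max() + 1
        rejected[order[:k]] = True
    return rejected
```

A typical call would be `benjamini_hochberg(conformal_p_values(cal_scores, test_scores), alpha=0.1)`, which flags the test points declared novel. The bFDR concerns the discoveries closest to the rejection threshold, that is, the rejected points with the largest p-values.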