An open scientific challenge is how to classify events with reliable measures of uncertainty when we have a mechanistic model of the data-generating process, but the distribution over both labels and latent nuisance parameters differs between training and target data. We refer to this type of distributional shift as generalized label shift (GLS). Direct classification using observed data $\mathbf{X}$ as covariates leads to biased predictions and invalid uncertainty estimates of labels $Y$. We overcome these biases by proposing a new method for robust uncertainty quantification that casts classification as a hypothesis testing problem under nuisance parameters. The key idea is to estimate the classifier's receiver operating characteristic (ROC) across the entire nuisance parameter space, which allows us to devise cutoffs that are invariant under GLS. Our method effectively endows a pre-trained classifier with domain adaptation capabilities and returns valid prediction sets while maintaining high power. We demonstrate its performance on two challenging scientific problems in biology and astroparticle physics with data from realistic mechanistic models.
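To make the key idea concrete, the following is a minimal sketch (not the authors' implementation) of how a nuisance-invariant cutoff could be obtained: for each value of a nuisance parameter $\nu$ on a grid, estimate the $(1-\alpha)$ quantile of the classifier score under the null class, then take the supremum over the grid. Rejecting when the score exceeds this uniform cutoff controls the type-I error at level $\alpha$ for every nuisance value, and hence under any shift in the nuisance distribution. The classifier, the Gaussian null model, and the grid are all hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def classifier_score(x):
    # Hypothetical pre-trained classifier score; here the identity for illustration.
    return x

# Null-class (H0) data whose distribution depends on a nuisance parameter nu:
# here, toy Gaussian data with mean nu.
nuisance_grid = np.linspace(0.0, 2.0, 21)
alpha = 0.05

# For each nuisance value, estimate the (1 - alpha) score quantile under H0.
cutoffs = []
for nu in nuisance_grid:
    scores = classifier_score(rng.normal(loc=nu, scale=1.0, size=5000))
    cutoffs.append(np.quantile(scores, 1 - alpha))

# A cutoff valid uniformly over the nuisance space: take the supremum.
uniform_cutoff = max(cutoffs)

def reject_null(x):
    # Rejects H0 at level alpha for every nu in the grid, so the decision rule
    # remains valid when the nuisance distribution shifts between train and target.
    return classifier_score(x) > uniform_cutoff
```

Taking the supremum trades power for validity: at nuisance values far from the worst case the test is conservative, which is why the paper estimates the full ROC across the nuisance space rather than a single global cutoff.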