This work addresses the problem of constructing reliable prediction intervals for individual counterfactual outcomes. Existing conformal counterfactual inference (CCI) methods provide marginal coverage guarantees but often produce overly conservative intervals, particularly under treatment imbalance, when counterfactual samples are scarce. We introduce synthetic data-powered CCI (SP-CCI), a framework that augments the calibration set with synthetic counterfactual labels generated by a pre-trained counterfactual model. To ensure validity, SP-CCI incorporates the synthetic samples into a conformal calibration procedure based on risk-controlling prediction sets (RCPS), with a debiasing step informed by prediction-powered inference (PPI). We prove that SP-CCI achieves tighter prediction intervals while preserving marginal coverage, with theoretical guarantees under both exact and approximate importance weighting. Empirical results on multiple datasets confirm that SP-CCI consistently yields narrower intervals than standard CCI across all settings tested.
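To make the calibration idea concrete, the following is a minimal sketch of a PPI-style debiased threshold search, not the SP-CCI procedure itself. It rests on several assumptions that are not taken from the abstract: nonconformity scores are absolute residuals, the loss is the miscoverage indicator, importance weights are omitted (that is, no covariate shift between the calibration and target populations), and the plug-in estimate plus a user-chosen `slack` stands in for the upper confidence bound that RCPS would use. The function names `debiased_miscoverage` and `calibrate_threshold` are hypothetical.

```python
import numpy as np


def debiased_miscoverage(lam, s_real, s_syn_paired, s_syn_extra):
    """PPI-style estimate of P(score > lam).

    s_real       : scores computed with observed labels (small calibration set)
    s_syn_paired : scores on the same points, computed with synthetic labels
    s_syn_extra  : scores on additional points that have only synthetic labels
    """
    s_real = np.asarray(s_real, dtype=float)
    s_syn_paired = np.asarray(s_syn_paired, dtype=float)
    s_syn_extra = np.asarray(s_syn_extra, dtype=float)
    # Miscoverage rate measured on the large synthetic pool.
    synthetic_rate = np.mean(s_syn_extra > lam)
    # Rectifier: how far synthetic labels misstate the loss on the points
    # where both label types are available.
    rectifier = np.mean(
        (s_real > lam).astype(float) - (s_syn_paired > lam).astype(float)
    )
    return float(synthetic_rate + rectifier)


def calibrate_threshold(alpha, s_real, s_syn_paired, s_syn_extra, slack=0.0):
    """Smallest threshold whose estimated risk stays at or below alpha.

    Assumption: `slack` is a placeholder for a concentration term; with
    slack=0 this is a plug-in rule without a finite-sample guarantee.
    """
    grid = np.unique(
        np.concatenate([
            np.asarray(s_real, dtype=float),
            np.asarray(s_syn_paired, dtype=float),
            np.asarray(s_syn_extra, dtype=float),
        ])
    )[::-1]  # scan from the largest candidate downwards
    chosen = float("inf")
    for lam in grid:
        risk = debiased_miscoverage(lam, s_real, s_syn_paired, s_syn_extra)
        if risk + slack > alpha:
            break  # stop at the first violation, as in RCPS-style scans
        chosen = float(lam)
    return chosen
```

Under these assumptions, a counterfactual interval for a new unit would take the form of the point prediction plus or minus the returned threshold. Scanning downwards and stopping at the first violation matters because the debiased estimate, unlike an empirical miscoverage rate, need not be monotone in the threshold.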