We present PPI++: a computationally lightweight methodology for estimation and inference based on a small labeled dataset and a typically much larger dataset of machine-learning predictions. The methods automatically adapt to the quality of available predictions, yielding easy-to-compute confidence sets -- for parameters of any dimensionality -- that always improve on classical intervals using only the labeled data. PPI++ builds on prediction-powered inference (PPI), which targets the same problem setting, improving its computational and statistical efficiency. Real and synthetic experiments demonstrate the benefits of the proposed adaptations.
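To make the setting concrete, the following is a minimal sketch of a power-tuned, prediction-powered estimate for the simplest case, the mean of an outcome. It is an illustration under stated assumptions, not the authors' reference implementation. The function name `ppi_pp_mean`, its arguments, and the clipping of the tuning weight to [0, 1] are choices made for this example. `y` and `f_lab` are the labels and predictions on the small labeled set, and `f_unlab` holds the predictions on the larger unlabeled set. The weight `lam` is a plug-in estimate that shrinks toward zero when predictions are uninformative, in which case the interval reduces to the classical one.

```python
import numpy as np
from statistics import NormalDist


def ppi_pp_mean(y, f_lab, f_unlab, alpha=0.1):
    """Sketch of a power-tuned prediction-powered CI for a mean (hypothetical helper).

    y       : labels on the labeled set, shape (n,)
    f_lab   : model predictions on the labeled set, shape (n,)
    f_unlab : model predictions on the unlabeled set, shape (N,)
    Returns (point_estimate, (lower, upper), lam).
    """
    y = np.asarray(y, dtype=float)
    f_lab = np.asarray(f_lab, dtype=float)
    f_unlab = np.asarray(f_unlab, dtype=float)
    n, N = len(y), len(f_unlab)

    # Plug-in tuning weight: Cov(Y, f) / ((1 + n/N) * Var(f)).
    # Var(f) is estimated from all predictions; Cov(Y, f) from the labeled set.
    var_f = np.concatenate([f_lab, f_unlab]).var(ddof=1)
    cov_yf = np.cov(y, f_lab, ddof=1)[0, 1]
    lam = cov_yf / ((1.0 + n / N) * var_f) if var_f > 0 else 0.0
    lam = float(np.clip(lam, 0.0, 1.0))  # assumption: clip to [0, 1] for stability

    # Point estimate: classical mean plus a weighted prediction correction.
    theta = y.mean() + lam * (f_unlab.mean() - f_lab.mean())

    # Variance of the estimate: labeled-sample term plus unlabeled-sample term.
    var = (y - lam * f_lab).var(ddof=1) / n + lam**2 * f_unlab.var(ddof=1) / N

    # Normal-approximation interval at level 1 - alpha.
    half = NormalDist().inv_cdf(1 - alpha / 2) * np.sqrt(var)
    return float(theta), (float(theta - half), float(theta + half)), lam
```

With `lam = 0` the estimate is the sample mean of `y` and the interval is the usual normal-approximation interval. With accurate predictions `lam` moves toward 1 and the interval narrows, because the residual `y - lam * f_lab` has smaller variance than `y` itself.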