Prior-Data Fitted Networks (PFNs) represent a paradigm shift in tabular data prediction. We present the principles of this new paradigm and evaluate two PFNs for estimating the average treatment effect (ATE) of a binary treatment on a binary outcome, using simulated clinical scenarios based on real-world data. We assessed TabPFN combined with causal inference procedures (g-computation and inverse probability of treatment weighting), and CausalPFN, a PFN that directly provides an ATE estimate with a credible interval. Confidence intervals for the TabPFN-based methods were derived using bootstrap resampling. We found that computation times for TabPFN were prohibitive for routine causal inference, particularly because bootstrapping is needed to obtain confidence intervals. Moreover, g-computation with TabPFN produced a highly biased estimator; fitting separate models for each treatment group (T-learner) only partially corrected this bias. CausalPFN, by contrast, was computationally efficient, but its 95% credible interval for the ATE had poor coverage, owing to both estimation bias and inadequate uncertainty quantification. Beyond automating model specification, some PFN variants, such as CausalPFN, attempt to automate causal modeling. In the settings we evaluated, CausalPFN performed poorly. However, new algorithms of this kind continue to be developed, and their application to causal inference tasks requires further investigation.
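To make the estimators named above concrete, the following is a minimal sketch of g-computation with one outcome model per treatment arm (T-learner style), inverse probability of treatment weighting, and a percentile bootstrap interval. It is not the study's implementation: stratum frequencies on a single binary confounder stand in for TabPFN, and the simulated scenario, the function names (`simulate`, `g_computation`, `iptw`, `bootstrap_ci`), and all parameter values are assumptions chosen for illustration.

```python
import random


def mean(values):
    return sum(values) / len(values)


def simulate(n, rng):
    """Toy scenario (assumed, not from the study): binary confounder x,
    binary treatment t, binary outcome y, true risk difference 0.2."""
    rows = []
    for _ in range(n):
        x = int(rng.random() < 0.5)
        t = int(rng.random() < (0.7 if x else 0.3))  # x influences treatment
        y = int(rng.random() < 0.2 + 0.2 * t + 0.3 * x)  # x influences outcome
        rows.append((x, t, y))
    return rows


def g_computation(rows):
    """Fit P(Y=1 | T=t, X=x) separately in each arm, predict both potential
    outcomes for every subject, and average the difference."""
    risk = {}
    for t in (0, 1):
        for x in (0, 1):
            ys = [yi for xi, ti, yi in rows if ti == t and xi == x]
            risk[(t, x)] = mean(ys) if ys else 0.5  # fallback for an empty stratum
    return mean([risk[(1, x)] - risk[(0, x)] for x, _, _ in rows])


def iptw(rows):
    """Weight each subject by the inverse probability of the treatment
    actually received (normalized weights); assumes positivity holds."""
    ps = {x: mean([ti for xi, ti, _ in rows if xi == x]) for x in (0, 1)}
    treated = [(1 / ps[x], y) for x, t, y in rows if t == 1]
    control = [(1 / (1 - ps[x]), y) for x, t, y in rows if t == 0]
    risk1 = sum(w * y for w, y in treated) / sum(w for w, _ in treated)
    risk0 = sum(w * y for w, y in control) / sum(w for w, _ in control)
    return risk1 - risk0


def bootstrap_ci(rows, estimator, n_boot, rng, level=0.95):
    """Percentile bootstrap: refit the estimator on n_boot resamples."""
    n = len(rows)
    estimates = sorted(
        estimator([rows[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = estimates[int((1 - level) / 2 * n_boot)]
    upper = estimates[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


rng = random.Random(0)
data = simulate(2000, rng)
ate_g = g_computation(data)
ate_w = iptw(data)
ci = bootstrap_ci(data, g_computation, 200, rng)
print("g-computation ATE:", round(ate_g, 3))
print("IPTW ATE:", round(ate_w, 3))
print("95% bootstrap CI:", tuple(round(b, 3) for b in ci))
```

With these saturated stand-in models the two point estimates coincide up to rounding; with a flexible learner they generally differ. Because `bootstrap_ci` refits the model on every resample, total cost grows linearly with `n_boot`, which is why a slow base learner makes bootstrap intervals expensive.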