Foundation models for tabular data, such as the Tabular Prior-data Fitted Network (TabPFN), are pre-trained on a massive number of synthetic datasets generated by structural causal models (SCMs) and leverage in-context learning to achieve high predictive accuracy on real-world tasks. However, the fairness properties of these foundation models, which incorporate ideas from causal reasoning during pre-training, remain underexplored. In this work, we conduct a comprehensive empirical evaluation of TabPFN and its fine-tuned variants, assessing predictive performance, fairness, and robustness across varying dataset sizes and distributional shifts. Our results reveal that while TabPFN achieves stronger predictive accuracy than baselines and is robust to spurious correlations, its improvements in fairness are moderate and inconsistent, particularly under missing-not-at-random (MNAR) covariate shifts. These findings suggest that the causal pre-training in TabPFN is helpful but insufficient for algorithmic fairness, highlighting implications for deploying TabPFN (and similar) models in practice and the need for further fairness interventions.
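As a concrete illustration of the kind of group-fairness criterion such an evaluation typically reports (the abstract does not specify the exact metrics used), the sketch below computes the demographic parity difference: the absolute gap in positive-prediction rates between two groups defined by a binary sensitive attribute. The function name and toy data are hypothetical, for illustration only.

```python
# Hypothetical sketch: demographic parity difference, a common group-fairness
# metric. A value of 0 means both groups receive positive predictions at the
# same rate; larger values indicate a larger disparity.
def demographic_parity_difference(y_pred, group):
    """Absolute gap in positive-prediction rates between groups 0 and 1."""
    def positive_rate(g):
        members = [p for p, s in zip(y_pred, group) if s == g]
        return sum(members) / len(members)
    return abs(positive_rate(0) - positive_rate(1))

# Toy binary predictions and a toy binary sensitive attribute.
preds = [1, 1, 0, 1, 0, 0]
sens  = [0, 0, 0, 1, 1, 1]
print(round(demographic_parity_difference(preds, sens), 4))  # prints 0.3333
```

Analogous gap statistics (e.g., equalized-odds differences over true/false positive rates) follow the same pattern of comparing per-group rates.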