Foundation models for tabular data, such as the Tabular Prior-data Fitted Network (TabPFN), are pre-trained on massive numbers of synthetic datasets generated by structural causal models (SCMs). They leverage in-context learning to achieve high predictive accuracy on real-world tasks. However, the fairness properties of these foundation models, which incorporate ideas from causal reasoning during pre-training, have not yet been explored in sufficient depth. In this work, we conduct a comprehensive empirical evaluation of TabPFN and its fine-tuned variants, assessing predictive performance, fairness, and robustness across varying dataset sizes and distribution shifts. Our results reveal that while TabPFN achieves stronger predictive accuracy than baselines and exhibits robustness to spurious correlations, its improvements in fairness are moderate and inconsistent, particularly under missing-not-at-random (MNAR) covariate shifts. These findings suggest that TabPFN's causal pre-training is helpful but insufficient for algorithmic fairness, highlighting implications for deploying such models in practice and the need for further fairness interventions.