Transformer-based tabular foundation models have recently demonstrated promising in-context learning (ICL) performance on structured data, emerging as competitive alternatives to gradient-boosted trees. However, the fairness implications of this new paradigm remain largely unexplored. We present the first investigation of fairness in tabular ICL, evaluating three recently proposed foundation models (TabPFNv2, TabICL, and TabDPT) on multiple benchmark datasets. To mitigate biases, we explore three pre-processing fairness-enhancing methods: correlation removal (decorrelating input features from the sensitive attribute), group-balanced sample selection (ensuring equal representation of protected groups among context examples), and uncertainty-based sample selection (prioritizing context examples with high sensitive-attribute prediction uncertainty). Our experiments show that the uncertainty-based strategy consistently improves group fairness metrics (e.g., demographic parity, equalized odds, and equal opportunity) with minimal impact on predictive accuracy. We release our code to facilitate reproducibility: https://github.com/patrikken/Fair-TabICL.
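To make the uncertainty-based strategy concrete, the following is a minimal sketch of how such a context-selection step could look. It assumes a NumPy feature matrix and a logistic-regression auxiliary model for predicting the sensitive attribute; the function name, arguments, and choice of auxiliary model are illustrative assumptions, not the released implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def uncertainty_based_selection(X_pool, y_pool, s_pool, k):
    """Select k context examples whose sensitive attribute is hardest
    to predict from the features (highest predictive entropy).

    X_pool: candidate context features, shape (n, d)
    y_pool: candidate context labels, shape (n,)
    s_pool: sensitive attribute of the candidates, shape (n,)
    k:      number of in-context examples to keep
    """
    # Fit an auxiliary classifier that predicts the sensitive attribute from X.
    aux = LogisticRegression(max_iter=1000).fit(X_pool, s_pool)
    proba = aux.predict_proba(X_pool)

    # Predictive entropy of the sensitive-attribute prediction per sample.
    entropy = -np.sum(proba * np.log(proba + 1e-12), axis=1)

    # Keep the k most uncertain samples as the in-context set.
    idx = np.argsort(entropy)[-k:]
    return X_pool[idx], y_pool[idx]
```

Under this sketch, the selected subset would then be supplied as the in-context examples to the tabular foundation model (e.g., TabPFNv2, TabICL, or TabDPT) in place of a randomly drawn context.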