Tabular Foundation Models (TFMs) have recently shown strong in-context learning capabilities on structured data, achieving zero-shot performance comparable to traditional machine learning methods. This work presents the first comprehensive study of fine-tuning for TFMs across benchmarks including TALENT, OpenML-CC18, and TabZilla. We compare zero-shot, meta-learning, supervised fine-tuning (SFT), and parameter-efficient fine-tuning (PEFT) approaches, analyzing how dataset factors such as class imbalance, size, and dimensionality affect outcomes. We find that zero-shot TFMs already achieve strong performance, while the benefits of fine-tuning are highly model- and data-dependent: meta-learning and PEFT provide moderate gains under specific conditions, whereas full SFT often degrades accuracy or calibration. Our analysis covers performance, calibration, and fairness, yielding practical guidelines on when fine-tuning is most beneficial and where its limitations lie.