On the Robustness of Tabular Foundation Models: Test-Time Attacks and In-Context Defenses

Source: arXiv. This work has been accepted for publication at the IEEE Conference on Secure and Trustworthy Machine Learning (SaTML) 2026. The final version will be available on IEEE Xplore.

Recent tabular foundation models (FMs), such as TabPFN and TabICL, leverage in-context learning to achieve strong performance without gradient updates or fine-tuning. However, their robustness to adversarial manipulation remains largely unexplored. In this work, we present a comprehensive study of the adversarial vulnerabilities of tabular FMs, focusing on both their fragility to targeted test-time attacks and their potential misuse as adversarial tools. We show on three benchmarks in finance, cybersecurity, and healthcare that small, structured perturbations to test inputs can significantly degrade prediction accuracy, even when the training context remains fixed. Additionally, we demonstrate that tabular FMs can be repurposed to generate evasion attacks that transfer to conventional models such as random forests and XGBoost and, to a lesser extent, to deep tabular models. To harden tabular FMs, we formulate robustification as an optimization over either the model weights (adversarial fine-tuning) or the context (adversarial in-context learning). We introduce an in-context adversarial training strategy that incrementally replaces the context with adversarially perturbed instances, without updating model weights. Our approach improves robustness across multiple tabular benchmarks. Together, these findings position tabular FMs as both a target and a source of adversarial threats, highlighting the urgent need for robust training and evaluation practices in this emerging paradigm.
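The two ideas in the abstract, a bounded test-time attack against a fixed context and a defense that swaps context rows for adversarially perturbed ones, can be illustrated with a toy sketch. This is not the authors' method or code. It assumes a k-nearest-neighbour vote as a stand-in for an in-context learner such as TabPFN, a random-search attack inside an L-infinity ball, and synthetic data. The names `predict`, `attack`, and `robustify_context`, and the parameters `eps`, `trials`, `rounds`, and `frac`, are hypothetical.

```python
import numpy as np

def predict(context_X, context_y, X, k=5):
    """Stand-in in-context learner (assumption): k-NN vote over the context.
    Like a tabular FM, it has no weights to update; only the context changes."""
    dists = np.linalg.norm(X[:, None, :] - context_X[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return (context_y[nearest].mean(axis=1) > 0.5).astype(int)

def attack(context_X, context_y, X, y, eps=0.3, trials=20, rng=None):
    """Random-search evasion (assumption, not the paper's attack).
    Every candidate stays within an L-infinity ball of radius eps around X."""
    if rng is None:
        rng = np.random.default_rng(0)
    X_adv = X.copy()
    for _ in range(trials):
        cand = X + rng.uniform(-eps, eps, size=X.shape)
        fooled = predict(context_X, context_y, cand) != y
        still_correct = predict(context_X, context_y, X_adv) == y
        update = fooled & still_correct  # keep the first successful evasion
        X_adv[update] = cand[update]
    return X_adv

def robustify_context(context_X, context_y, rounds=3, frac=0.2, eps=0.3, seed=0):
    """Incrementally replace context rows with adversarial versions.
    Labels are kept; no model weights exist or are updated."""
    rng = np.random.default_rng(seed)
    ctx = context_X.copy()
    n = len(ctx)
    for _ in range(rounds):
        idx = rng.choice(n, size=int(frac * n), replace=False)
        mask = np.ones(n, dtype=bool)
        mask[idx] = False
        # Attack the clean held-out rows using the rest of the current context,
        # so each replaced row stays within eps of its original value.
        ctx[idx] = attack(ctx[mask], context_y[mask],
                          context_X[idx], context_y[idx], eps=eps, rng=rng)
    return ctx

# Synthetic binary task (assumption): label depends on the first two features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
ctx_X, ctx_y, test_X, test_y = X[:150], y[:150], X[150:], y[150:]

adv_X = attack(ctx_X, ctx_y, test_X, test_y, eps=0.3, rng=rng)
clean_acc = (predict(ctx_X, ctx_y, test_X) == test_y).mean()
adv_acc = (predict(ctx_X, ctx_y, adv_X) == test_y).mean()

robust_ctx = robustify_context(ctx_X, ctx_y, eps=0.3)
robust_adv_acc = (predict(robust_ctx, ctx_y, adv_X) == test_y).mean()
print(f"clean={clean_acc:.2f} attacked={adv_acc:.2f} "
      f"attacked, robust context={robust_adv_acc:.2f}")
```

In this sketch the attack can only lower accuracy relative to the clean inputs, because a row is changed only when the change flips a correct prediction. Whether the modified context actually recovers accuracy depends on the data and the stand-in model, so the printed numbers are illustrative only and say nothing about the results reported in the paper.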

