ExplainerPFN: Towards tabular foundation models for model-free zero-shot feature importance estimations

Computing the importance of features in supervised classification tasks is critical for model interpretability. Shapley values are a widely used approach for explaining model predictions, but require direct access to the underlying model, an assumption frequently violated in real-world deployments. Further, even when model access is possible, exact computation of Shapley values may be prohibitively expensive. We investigate whether meaningful Shapley value estimates can be obtained in a zero-shot setting, using only the input data distribution and no evaluations of the target model. To this end, we introduce ExplainerPFN, a tabular foundation model built on TabPFN that is pretrained on synthetic datasets generated from random structural causal models and supervised using exact or near-exact Shapley values. Once trained, ExplainerPFN predicts feature attributions for unseen tabular datasets without model access, gradients, or example explanations. Our contributions are fourfold: (1) we show that few-shot learning-based explanations can achieve high fidelity to SHAP values with as few as two reference observations; (2) we propose ExplainerPFN, the first zero-shot method for estimating Shapley values without access to the underlying model or reference explanations; (3) we provide an open-source implementation of ExplainerPFN, including the full training pipeline and synthetic data generator; and (4) through extensive experiments on real and synthetic datasets, we show that ExplainerPFN achieves performance competitive with few-shot surrogate explainers that rely on 2-10 SHAP examples.
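To make the cost and the model-access requirement concrete, the following is a minimal sketch of exact Shapley value computation by coalition enumeration. It is not ExplainerPFN and not the authors' pipeline; it only illustrates the conventional baseline the abstract contrasts against. The function name `exact_shapley`, the `predict` callable, the `background` array, and the choice of an interventional value function (absent features replaced by background rows) are all assumptions made for illustration.

```python
from itertools import combinations
from math import factorial

import numpy as np


def exact_shapley(predict, x, background):
    """Exact Shapley values for one instance x by enumerating all coalitions.

    Assumed value function: v(S) is the mean prediction when the features in S
    are fixed to their values in x and the rest come from the background rows.
    Every call to value() queries the model, which is the access requirement
    a zero-shot estimator would avoid.
    """
    d = x.shape[0]

    def value(subset):
        samples = background.copy()
        idx = list(subset)
        if idx:
            samples[:, idx] = x[idx]
        return float(predict(samples).mean())

    phi = np.zeros(d)
    for i in range(d):
        others = [j for j in range(d) if j != i]
        # Enumerate every coalition S that excludes feature i: 2^(d-1) of them.
        for size in range(d):
            weight = factorial(size) * factorial(d - size - 1) / factorial(d)
            for subset in combinations(others, size):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi


# Toy usage with a hypothetical linear model on three features.
rng = np.random.default_rng(0)
background = rng.normal(size=(50, 3))
w = np.array([2.0, -1.0, 0.5])


def predict(X):
    return X @ w


x = np.array([1.0, 2.0, 3.0])
phi = exact_shapley(predict, x, background)
```

The nested loops evaluate the model on the order of d * 2^(d-1) coalitions per explained instance, which is why exact computation becomes impractical as the number of features grows. For the linear toy model used here, the result should match the closed form w_i * (x_i - mean of background feature i), which gives a quick sanity check.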
