We present the first causal mechanistic analysis of a tabular foundation model, investigating how TabPFN 2.5's feature-wise attention heads distribute computation across layers. Using activation patching, ablation, and attention entropy on two synthetic regression datasets, we find clear temporal specialisation across layers: the causal necessity of one head exceeds that of the others by a factor of 2 to 5 at its peak layer, and its dominant layer shifts with task complexity, while the remaining heads exhibit symmetric late-layer profiles. Attention entropy and patching provide convergent evidence for which layers of the dominant head are computationally active. We additionally investigate inference-time steerability via contrastive activation steering and find that it fails to transfer across samples. We attribute this result to TabPFN's in-context learning mechanism, which encodes task structure through context-dependent attention rather than through the stable parametric directions that make steering tractable in language models.
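As an illustration of the attention-entropy measure mentioned above, the following is a minimal sketch of how the entropy of a single head could be computed. It is not the authors' implementation: the function names, the nested-list layout `attn[i][j]` (the weight that query feature `i` places on key feature `j`), and the choice to average over query rows and normalise by the log of the number of keys are all assumptions made for this example.

```python
import math
from typing import Sequence


def row_entropy(probs: Sequence[float], eps: float = 1e-12) -> float:
    """Shannon entropy (in nats) of one attention row."""
    total = sum(probs)
    if total <= 0:
        raise ValueError("attention row must have positive mass")
    entropy = 0.0
    for p in probs:
        q = p / total  # renormalise to guard against rounding drift
        if q > eps:  # treat 0 * log(0) as 0
            entropy -= q * math.log(q)
    return entropy


def head_entropy(attn: Sequence[Sequence[float]]) -> float:
    """Mean row entropy for one head (hypothetical layout: attn[query][key])."""
    if len(attn) == 0:
        raise ValueError("attention matrix must have at least one row")
    return sum(row_entropy(row) for row in attn) / len(attn)


def normalised_head_entropy(attn: Sequence[Sequence[float]]) -> float:
    """Head entropy divided by log(number of keys), so values lie in [0, 1]."""
    n_keys = len(attn[0])
    if n_keys < 2:
        return 0.0  # a single key admits only one distribution
    return head_entropy(attn) / math.log(n_keys)


if __name__ == "__main__":
    diffuse = [[0.25, 0.25, 0.25, 0.25]] * 3  # attends to all features equally
    focused = [[0.97, 0.01, 0.01, 0.01]] * 3  # concentrates on one feature
    print(normalised_head_entropy(diffuse))
    print(normalised_head_entropy(focused))
```

Under this definition, a value near 1 means the head spreads its attention almost uniformly over features, while a value near 0 means it concentrates on a few. Comparing the value layer by layer is one way to look for the layers in which a head is selectively active, though the exact entropy definition used in the analysis is not specified here.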