We present a systematic evaluation framework (37 analyses, 153 statistical tests, four cell types, two perturbation modalities) for assessing mechanistic interpretability in single-cell foundation models. Applying this framework to scGPT and Geneformer, we find that attention patterns encode structured biological information with layer-specific organisation: protein-protein interactions in early layers and transcriptional regulation in late layers. This structure nonetheless provides no incremental value for perturbation prediction: trivial gene-level baselines outperform both attention and correlation edges (AUROC 0.81-0.88 versus 0.70), pairwise edge scores add zero predictive contribution, and causal ablation of regulatory heads produces no degradation. These findings generalise from K562 to RPE1 cells; the attention-correlation relationship is context-dependent, but gene-level dominance is universal. Cell-State Stratified Interpretability (CSSI) addresses an attention-specific scaling failure, improving gene regulatory network (GRN) recovery by up to 1.85x. The framework establishes reusable quality-control standards for the field.