CD8+ "killer" T cells and CD4+ "helper" T cells play a central role in the adaptive immune system by recognizing antigens presented by peptide-Major Histocompatibility Complex (pMHC) molecules via T Cell Receptors (TCRs). Modeling binding between T cells and the pMHC complex is fundamental both to understanding the basic mechanisms of the human immune response and to developing therapies. While transformer-based models such as TULIP have achieved impressive performance in this domain, their black-box nature precludes interpretability and thus limits a deeper mechanistic understanding of T cell response. Most existing post-hoc explainable AI (XAI) methods are confined to encoder-only, co-attention, or model-specific architectures and cannot handle the encoder-decoder transformers used in TCR-pMHC modeling. To address this gap, we propose Quantifying Cross-Attention Interaction (QCAI), a new post-hoc method designed to interpret the cross-attention mechanisms in transformer decoders. Because quantitative evaluation is a challenge for XAI methods, we have compiled TCR-XAI, a benchmark of 274 experimentally determined TCR-pMHC structures that serve as ground truth for binding. Using these structures, we compute physical distances between relevant amino acid residues in the TCR-pMHC interaction region and evaluate how well our method and others estimate the importance of residues in this region across the dataset. We show that QCAI achieves state-of-the-art performance in both interpretability and prediction accuracy on the TCR-XAI benchmark.
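The distance-based evaluation described above can be illustrated with a minimal sketch. It is not the TCR-XAI protocol itself: the one-point-per-residue representation, the 5.0 Å contact cutoff, the AUC-style score, and the names `min_distances` and `contact_auc` are all assumptions made for illustration, and the coordinates and importance scores are toy values rather than benchmark data.

```python
import math


def min_distances(tcr_coords, peptide_coords):
    """For each TCR residue, return the distance to its nearest peptide residue.

    Each residue is represented by a single 3D point (an assumption; a real
    pipeline might use all heavy atoms instead).
    """
    return [min(math.dist(t, p) for p in peptide_coords) for t in tcr_coords]


def contact_auc(importance, distances, cutoff=5.0):
    """Probability that a contact residue gets a higher importance score
    than a non-contact residue (ties count as one half).

    `cutoff` is an illustrative threshold, not a value taken from the paper.
    """
    pos = [s for s, d in zip(importance, distances) if d <= cutoff]
    neg = [s for s, d in zip(importance, distances) if d > cutoff]
    if not pos or not neg:
        raise ValueError("need at least one contact and one non-contact residue")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy inputs, made up for illustration only.
tcr = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
peptide = [(0.0, 4.0, 0.0), (3.0, 4.0, 0.0)]
scores = [0.9, 0.7, 0.2, 0.1]  # hypothetical per-residue importance from an XAI method

dists = min_distances(tcr, peptide)
auc = contact_auc(scores, dists)
```

A score near 1 would mean the explanation method ranks residues close to the peptide above distant ones, while a score near 0.5 would indicate no better than chance under these assumptions.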