Attention matrices are fundamental to transformer research, supporting a broad range of applications including interpretability, visualization, manipulation, and distillation. Yet most existing analyses focus on individual attention heads or layers, failing to account for the model's global behavior. While prior efforts have extended attention formulations across multiple heads via averaging and matrix multiplications, or incorporated components such as normalization and FFNs, a unified and complete representation that encapsulates all transformer blocks is still lacking. We address this gap by introducing TensorLens, a novel formulation that captures the entire transformer as a single, input-dependent linear operator expressed through a high-order attention-interaction tensor. This tensor jointly encodes attention, FFNs, activations, normalizations, and residual connections, offering a theoretically coherent and expressive linear representation of the model's computation. TensorLens is theoretically grounded, and our empirical validation shows that it yields richer representations than previous attention-aggregation methods. Our experiments demonstrate that the attention tensor can serve as a powerful foundation for developing tools aimed at interpretability and model understanding. Our code is included in the supplementary material.
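To make the "input-dependent linear operator" idea concrete, the following is a minimal sketch of the frozen-weights view on a single attention layer: once the attention matrix is computed for a given input, the layer acts as a fixed linear map on that input, and this map can be written explicitly as a Kronecker product. The names `Wq`, `Wk`, `Wv`, `Wo` are illustrative single-head projections, not the paper's notation, and this sketch omits the FFN, normalization, and residual terms that TensorLens additionally folds into its tensor.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 4, 8  # sequence length, model dimension
X = rng.standard_normal((n, d))
Wq, Wk, Wv, Wo = (rng.standard_normal((d, d)) for _ in range(4))

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Attention weights depend on the input X ...
A = softmax((X @ Wq) @ (X @ Wk).T / np.sqrt(d))
# ... but given A, the layer output is linear in X:
out = A @ X @ (Wv @ Wo)

# The same computation as one explicit linear operator acting on vec(X)
# (row-major vec: vec(A X B) = (A kron B^T) vec(X)).
T = np.kron(A, (Wv @ Wo).T)  # input-dependent (nd x nd) linear operator
out_lin = (T @ X.reshape(-1)).reshape(n, d)

assert np.allclose(out, out_lin)
```

Stacking layers composes such operators, which is why a whole-model linearization naturally becomes a higher-order tensor rather than a single matrix.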