Empirical studies have identified a range of learnability biases and limitations of transformers, including a persistent difficulty in learning simple formal languages such as PARITY and a bias towards low-degree functions. However, theoretical understanding remains limited: existing expressiveness theory either overpredicts or underpredicts realistic learning abilities. We prove that, under the transformer architecture, the loss landscape is constrained by input-space sensitivity: transformers whose output is sensitive to many parts of the input string inhabit isolated points in parameter space, leading to a low-sensitivity bias in generalization. We show theoretically and empirically that this theory unifies a broad array of empirical observations about the learning abilities and biases of transformers, such as their generalization bias towards low sensitivity and low degree, and their difficulty with length generalization for PARITY. This shows that understanding transformers' inductive biases requires studying not just their in-principle expressivity, but also their loss landscape.
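To make the notion of input-space sensitivity concrete, the following is a minimal sketch that computes the average sensitivity of a Boolean function by brute force: the number of single-bit flips that change the output, averaged over all inputs of length n. This is the standard definition from the analysis of Boolean functions and is assumed here for illustration; the exact quantity used in the paper's theorems may differ. The helper names (`average_sensitivity`, `parity`, `first_bit`, `majority`) are hypothetical.

```python
from itertools import product


def average_sensitivity(f, n):
    """Average, over all x in {0,1}^n, of the number of positions i
    such that flipping bit i changes f(x). Brute force: O(n * 2^n)."""
    total = 0
    for x in product((0, 1), repeat=n):
        fx = f(x)
        for i in range(n):
            # Flip the i-th bit and check whether the output changes.
            flipped = x[:i] + (1 - x[i],) + x[i + 1:]
            if f(flipped) != fx:
                total += 1
    return total / 2 ** n


def parity(x):
    # PARITY: 1 if the number of ones is odd. Every bit flip changes the output.
    return sum(x) % 2


def first_bit(x):
    # Depends on a single input position only.
    return x[0]


def majority(x):
    # 1 if more than half of the bits are ones.
    return int(sum(x) > len(x) / 2)


if __name__ == "__main__":
    n = 6
    print("parity:   ", average_sensitivity(parity, n))
    print("first_bit:", average_sensitivity(first_bit, n))
    print("majority: ", average_sensitivity(majority, 5))
```

Under this definition, PARITY attains the maximum possible average sensitivity (equal to the input length), whereas a function depending on a single position has average sensitivity 1. This is the sense in which PARITY sits at the high-sensitivity extreme that the abstract describes as hard to learn.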