Robustness of Similarity-based Positional Encoding Under Rotations: Theoretical Analysis and Experimental Validation

Positional encoding is a fundamental component of Transformer architectures, as it injects information about the spatial or sequential arrangement of inputs. Among recent alternatives to standard absolute and sinusoidal encodings, similarity-based positional encoding (simPE) has emerged as a flexible framework for representing positional structure through pairwise relations. simPE was originally designed for medical imaging applications, where geometric robustness is especially relevant: small rotations arise naturally during image acquisition, induced by imaging instruments, patient positioning, or slight misalignments. Despite its empirical promise, the theoretical behavior of simPE under geometric perturbations has not been fully characterized. In this paper, we study the robustness of simPE with respect to rotations, combining formal theoretical analysis with experimental validation. We first show that simPE is, in general, not rotation-invariant. We then prove that, under mild Lipschitz assumptions on its elementary components, simPE is stable under rotational perturbations, and we derive explicit perturbation bounds in the Frobenius norm. We validate these findings experimentally on four controlled datasets: a synthetic Arrow dataset, a synthetic Shapes dataset (four geometric shape categories), a synthetic Digits dataset, and a benchmark image classification dataset (FashionMNIST). In each case, training and validation images are kept in a fixed canonical orientation, while test images are rotated by increasing angles. Across all datasets, simPE consistently outperforms standard learned positional encoding in accuracy, F1 score, precision, and recall under rotation, particularly in the small-to-moderate angle regime, corroborating the theoretical stability guarantees.
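
The evaluation protocol described above (canonical orientation for training and validation, increasing rotation angles at test time) can be illustrated with a minimal sketch. This is not the authors' code. The helper names `rotate_image` and `accuracy_by_angle`, the nearest-neighbour resampling with zero fill, and the `predict` callable (standing in for any trained classifier) are all assumptions made for illustration.

```python
import numpy as np

def rotate_image(img, angle_deg):
    """Rotate a 2D image about its center (nearest-neighbour, zero fill).

    Hypothetical helper: the interpolation scheme is an assumption.
    """
    h, w = img.shape
    theta = np.deg2rad(angle_deg)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    y0, x0 = ys - cy, xs - cx
    # Inverse mapping: for every output pixel, locate its source pixel.
    src_x = np.rint(np.cos(theta) * x0 + np.sin(theta) * y0 + cx).astype(int)
    src_y = np.rint(-np.sin(theta) * x0 + np.cos(theta) * y0 + cy).astype(int)
    inside = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    out = np.zeros_like(img)
    out[inside] = img[src_y[inside], src_x[inside]]
    return out

def accuracy_by_angle(predict, images, labels, angles_deg):
    """Evaluate a fixed classifier on test images rotated by each angle.

    `predict` is a hypothetical callable mapping an array of images
    (N, H, W) to an array of N predicted labels.
    """
    labels = np.asarray(labels)
    results = {}
    for angle in angles_deg:
        rotated = np.stack([rotate_image(im, angle) for im in images])
        results[angle] = float(np.mean(predict(rotated) == labels))
    return results
```

Calling `accuracy_by_angle(model_predict, test_images, test_labels, [0, 5, 10, 20])` on a model trained only on unrotated images would give one accuracy value per angle, which is the kind of curve the comparison between simPE and learned positional encoding relies on.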
