Positional encoding (PE) is a core architectural component of Transformers, yet its impact on their generalization and robustness remains unclear. In this work, we provide the first generalization analysis of a single-layer Transformer under in-context regression that explicitly accounts for a fully trainable PE module. Our result shows that PE systematically enlarges the generalization gap. Extending to the adversarial setting, we derive an adversarial Rademacher generalization bound and find that the gap between models with and without PE is magnified under attack, demonstrating that PE amplifies model vulnerability. Our bounds are empirically validated by a simulation study. Together, these results establish a new framework for understanding clean and adversarial generalization in ICL with PE.