位置编码的影响：上下文回归下Transformer的干净与对抗性Rademacher复杂性分析 (Impact of Positional Encoding: Clean and Adversarial Rademacher Complexity for Transformers under In-Context Regression) - 专知论文

会员服务 ·

0

位置编码 · 对抗 · 泛化 · 上下文 · 分析 ·

2025 年 12 月 10 日

Impact of Positional Encoding: Clean and Adversarial Rademacher Complexity for Transformers under In-Context Regression

翻译：位置编码的影响：上下文回归下Transformer的干净与对抗性Rademacher复杂性分析

Weiyi He,Yue Xing

from arxiv, 25 pages, 3 figures

Positional encoding (PE) is a core architectural component of Transformers, yet its impact on the Transformer's generalization and robustness remains unclear. In this work, we provide the first generalization analysis for a single-layer Transformer under in-context regression that explicitly accounts for a completely trainable PE module. Our result shows that PE systematically enlarges the generalization gap. Extending to the adversarial setting, we derive the adversarial Rademacher generalization bound. We find that the gap between models with and without PE is magnified under attack, demonstrating that PE amplifies the vulnerability of models. Our bounds are empirically validated by a simulation study. Together, this work establishes a new framework for understanding the clean and adversarial generalization in ICL with PE.

翻译：位置编码是Transformer架构的核心组成部分，但其对Transformer泛化能力和鲁棒性的影响尚不明确。本研究首次针对单层Transformer在上下文回归任务中的泛化行为进行分析，并明确考虑完全可训练的位置编码模块。结果表明，位置编码会系统性地扩大泛化差距。扩展至对抗性场景，我们推导了对抗性Rademacher泛化界，发现受攻击时含位置编码与不含位置编码模型的性能差距进一步扩大，证明位置编码会加剧模型的脆弱性。通过仿真实验验证了所提理论界的有效性。本研究为理解含位置编码的上下文学习中干净与对抗性泛化行为建立了新的理论框架。

0

相关内容

位置编码

51页《基于Transformer的多模态与自监督学习》最新报告，Google Xiaohua Zhai

51页《基于Transformer的多模态与自监督学习》最新报告，Google Xiaohua Zhai

专知会员服务

68+阅读 · 2023年2月24日

【CVPR 2022】基于实例深度估计的统一深度感知全景分割 PanopticDepth: Per-Instance Depth Estimation for Unified Depth-Aware Panoptic Segmentation

【CVPR 2022】基于实例深度估计的统一深度感知全景分割 PanopticDepth: Per-Instance Depth Estimation for Unified Depth-Aware Panoptic Segmentation

专知会员服务

18+阅读 · 2022年3月19日

【AAAI 2022】 GeomGCL:用于分子性质预测的几何图对比学习

【AAAI 2022】 GeomGCL:用于分子性质预测的几何图对比学习

专知会员服务

24+阅读 · 2022年2月27日

【ICML2021】图对比学习自动化

专知会员服务

41+阅读 · 2021年6月19日

语义相似性算法演化论文，29页pdf，Evolution of Semantic Similarity - A Survey

语义相似性算法演化论文，29页pdf，Evolution of Semantic Similarity - A Survey

专知会员服务

44+阅读 · 2020年4月30日

AAAI 2022 | ProtGNN：自解释图神经网络

AAAI 2022 | ProtGNN：自解释图神经网络

专知

10+阅读 · 2022年2月28日

Python图像处理，366页pdf，Image Operators Image Processing in Python

Python图像处理，366页pdf，Image Operators Image Processing in Python

专知

15+阅读 · 2020年7月23日

【CVPR2020-旷视】DPGN：分布传播图网络的小样本学习

【CVPR2020-旷视】DPGN：分布传播图网络的小样本学习

专知

13+阅读 · 2020年4月1日

【阿里巴巴-WWW2020】对抗性多模态表示学习的点击率预测，Adversarial Multimodal RL

【阿里巴巴-WWW2020】对抗性多模态表示学习的点击率预测，Adversarial Multimodal RL

专知

11+阅读 · 2020年3月17日

元迁移学习的小样本学习，Meta-transfer Learning for Few-shot Learning，33页ppt

元迁移学习的小样本学习，Meta-transfer Learning for Few-shot Learning，33页ppt

专知

72+阅读 · 2020年2月29日

求解时间依赖问题的隐式时空并行 Schwarz 算法研究

国家自然科学基金

0+阅读 · 2017年12月31日

基于图论方法的DNA序列编码研究

国家自然科学基金

2+阅读 · 2016年12月31日

两类Markov排队模型的衰减性质

国家自然科学基金

1+阅读 · 2015年12月31日

SDN数据平面中大规模流表的高性能查找方法研究

国家自然科学基金

4+阅读 · 2015年12月31日

基于决策模型和预备电位的运动想象BCI研究

国家自然科学基金

3+阅读 · 2015年12月31日

AODDiff: Probabilistic Reconstruction of Aerosol Optical Depth via Diffusion-based Bayesian Inference

Arxiv

0+阅读 · 2025年12月31日

Projection-based Adversarial Attack using Physics-in-the-Loop Optimization for Monocular Depth Estimation

Arxiv

0+阅读 · 2025年12月31日

Correctness of Extended RSA Public Key Cryptosystem

Arxiv

0+阅读 · 2025年12月31日

Robust Distributed Estimation: Extending Gossip Algorithms to Ranking and Trimmed Means

Arxiv

0+阅读 · 2025年12月30日

Neighbor-aware Instance Refining with Noisy Labels for Cross-Modal Retrieval

Arxiv

0+阅读 · 2025年12月30日

VIP会员

文章信息

相关主题

相关VIP内容

51页《基于Transformer的多模态与自监督学习》最新报告，Google Xiaohua Zhai

51页《基于Transformer的多模态与自监督学习》最新报告，Google Xiaohua Zhai

专知会员服务

68+阅读 · 2023年2月24日

【CVPR 2022】基于实例深度估计的统一深度感知全景分割 PanopticDepth: Per-Instance Depth Estimation for Unified Depth-Aware Panoptic Segmentation

【CVPR 2022】基于实例深度估计的统一深度感知全景分割 PanopticDepth: Per-Instance Depth Estimation for Unified Depth-Aware Panoptic Segmentation

专知会员服务

18+阅读 · 2022年3月19日

【AAAI 2022】 GeomGCL:用于分子性质预测的几何图对比学习

【AAAI 2022】 GeomGCL:用于分子性质预测的几何图对比学习

专知会员服务

24+阅读 · 2022年2月27日

【ICML2021】图对比学习自动化

专知会员服务

41+阅读 · 2021年6月19日

语义相似性算法演化论文，29页pdf，Evolution of Semantic Similarity - A Survey

语义相似性算法演化论文，29页pdf，Evolution of Semantic Similarity - A Survey

专知会员服务

44+阅读 · 2020年4月30日

热门VIP内容

开通专知VIP会员享更多权益服务

《面向小规模遥感应用引入思维链推理与多模态小语言模型》

《大国竞争时代的美国太空竞争力》50页报告

网络中心战：未来冲突

《自主无人机不会取代战斗机飞行员，将成为其僚机：协同作战飞机是下一代无人作战飞机》报告

相关资讯

AAAI 2022 | ProtGNN：自解释图神经网络

AAAI 2022 | ProtGNN：自解释图神经网络

专知

10+阅读 · 2022年2月28日

Python图像处理，366页pdf，Image Operators Image Processing in Python

Python图像处理，366页pdf，Image Operators Image Processing in Python

专知

15+阅读 · 2020年7月23日

【CVPR2020-旷视】DPGN：分布传播图网络的小样本学习

【CVPR2020-旷视】DPGN：分布传播图网络的小样本学习

专知

13+阅读 · 2020年4月1日

【阿里巴巴-WWW2020】对抗性多模态表示学习的点击率预测，Adversarial Multimodal RL

【阿里巴巴-WWW2020】对抗性多模态表示学习的点击率预测，Adversarial Multimodal RL

专知

11+阅读 · 2020年3月17日

元迁移学习的小样本学习，Meta-transfer Learning for Few-shot Learning，33页ppt

元迁移学习的小样本学习，Meta-transfer Learning for Few-shot Learning，33页ppt

专知

72+阅读 · 2020年2月29日

相关论文

AODDiff: Probabilistic Reconstruction of Aerosol Optical Depth via Diffusion-based Bayesian Inference

Arxiv

0+阅读 · 2025年12月31日

Projection-based Adversarial Attack using Physics-in-the-Loop Optimization for Monocular Depth Estimation

Arxiv

0+阅读 · 2025年12月31日

Correctness of Extended RSA Public Key Cryptosystem

Arxiv

0+阅读 · 2025年12月31日

Robust Distributed Estimation: Extending Gossip Algorithms to Ranking and Trimmed Means

Arxiv

0+阅读 · 2025年12月30日

Neighbor-aware Instance Refining with Noisy Labels for Cross-Modal Retrieval

Arxiv

0+阅读 · 2025年12月30日

相关基金

求解时间依赖问题的隐式时空并行 Schwarz 算法研究

国家自然科学基金

0+阅读 · 2017年12月31日

基于图论方法的DNA序列编码研究

国家自然科学基金

2+阅读 · 2016年12月31日

两类Markov排队模型的衰减性质

国家自然科学基金

1+阅读 · 2015年12月31日

SDN数据平面中大规模流表的高性能查找方法研究

国家自然科学基金

4+阅读 · 2015年12月31日

基于决策模型和预备电位的运动想象BCI研究

国家自然科学基金

3+阅读 · 2015年12月31日

微信扫码咨询专知VIP会员