Universal priors: solving empirical Bayes via Bayesian inference and pretraining

Nick Cannella, Anzo Teh, Yanjun Han, Yury Polyanskiy

From arXiv. To appear at COLT 2026. 43 pages, 5 figures. Code release: https://github.com/Anzoteh96/eb-transformers

We theoretically justify the recent empirical finding of [Teh et al., 2025] that a transformer pretrained on synthetically generated data achieves strong performance on empirical Bayes (EB) problems. We take an indirect approach to this question: rather than analyzing the model architecture or training dynamics, we ask why a pretrained Bayes estimator, trained under a prespecified training distribution, can adapt to arbitrary test distributions. Focusing on Poisson EB problems, we identify the existence of universal priors such that training under these priors yields a near-optimal regret bound of $\widetilde{O}(\frac{1}{n})$ uniformly over all test distributions. Our analysis leverages the classical phenomenon of posterior contraction in Bayesian statistics, showing that the pretrained transformer adapts to unknown test distributions precisely through posterior contraction. This perspective also explains the phenomenon of length generalization, in which the test sequence length exceeds the training length, as the model performs Bayesian inference using a generalized posterior.
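To make the posterior-contraction mechanism concrete, here is a minimal sketch of a hierarchical Bayes estimator in a toy Poisson EB problem. It assumes the usual formulation, in which $\theta_i \sim G$ i.i.d., $X_i \sim \mathrm{Poisson}(\theta_i)$, and regret is the excess mean squared error over the oracle Bayes estimator that knows $G$. The finite list of candidate priors, the uniform hyperprior over them, the fixed counts, and all function names are illustrative assumptions; none of them is taken from the paper or its code release.

```python
import math

# Toy illustration only: the finite candidate set and uniform hyperprior are
# assumptions made for this sketch, not the construction used in the paper.

def poisson_pmf(x, theta):
    """P(X = x) for X ~ Poisson(theta) with theta > 0, computed in log space."""
    return math.exp(-theta + x * math.log(theta) - math.lgamma(x + 1))

def marginal_pmf(x, prior):
    """Marginal f_G(x) for a discrete prior G given as [(atom, weight), ...]."""
    return sum(w * poisson_pmf(x, t) for t, w in prior)

def bayes_estimator(x, prior):
    """Posterior mean E[theta | X = x] = (x + 1) * f_G(x + 1) / f_G(x)."""
    return (x + 1) * marginal_pmf(x + 1, prior) / marginal_pmf(x, prior)

def posterior_over_priors(xs, candidates):
    """Posterior weights over the candidate priors under a uniform hyperprior."""
    logliks = [sum(math.log(marginal_pmf(x, g)) for x in xs) for g in candidates]
    top = max(logliks)  # subtract the maximum before exponentiating, for stability
    weights = [math.exp(ll - top) for ll in logliks]
    total = sum(weights)
    return [w / total for w in weights]

def hierarchical_bayes(xs, candidates):
    """E[theta_i | x_1..x_n]: posterior-weighted average of per-prior Bayes estimates."""
    post = posterior_over_priors(xs, candidates)
    return [sum(p * bayes_estimator(x, g) for p, g in zip(post, candidates))
            for x in xs]

# Hypothetical candidate priors: two point masses and their equal mixture.
CANDIDATES = [
    [(1.0, 1.0)],
    [(5.0, 1.0)],
    [(1.0, 0.5), (5.0, 0.5)],
]

# Hand-picked counts that look like draws from the mixture prior.
COUNTS = [0, 1, 6, 5, 0, 2, 7, 4, 1, 5]

# The posterior over priors sharpens as more observations are included.
for n in (2, len(COUNTS)):
    post = posterior_over_priors(COUNTS[:n], CANDIDATES)
    print(n, [round(p, 4) for p in post])
```

With only the first two counts the posterior favors the point mass at 1; with all ten it concentrates on the mixture, so the hierarchical estimates nearly coincide with the oracle Bayes estimates under that prior. This is posterior contraction in miniature. The abstract's result concerns universal priors over all test distributions, which this finite toy does not attempt to reproduce.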

