Optimal control of the future is the next frontier for AI. Current approaches to this problem are typically rooted in either reinforcement learning or online learning. While powerful, these learning frameworks are mathematically distinct from Probably Approximately Correct (PAC) learning, which has been the workhorse for the recent technological achievements in AI. We therefore build on prior work on prospective learning, an extension of PAC learning (without control) to non-stationary environments (De Silva et al., 2023; Silva et al., 2024; Bai et al., 2026). Here, we further extend the PAC learning framework to address learning and control in non-stationary environments. Using this framework, called "Prospective Control", we prove that, under fairly general assumptions, empirical risk minimization (ERM) asymptotically achieves the Bayes optimal policy. We then consider a specific instance of prospective control, foraging, which is a canonical task for any mobile agent, be it natural or artificial. We show that existing reinforcement learning algorithms fail to learn in these non-stationary environments, and that, even with modifications, they are orders of magnitude less efficient than our prospective foraging agents. Code is available at: https://github.com/neurodata/ProspectiveLearningwithControl.