Learning Rational Subgoals from Demonstrations and Instructions

March 9, 2023

Zhezheng Luo, Jiayuan Mao, Jiajun Wu, Tomás Lozano-Pérez, Joshua B. Tenenbaum, Leslie Pack Kaelbling

From arXiv; AAAI 2023. First two authors contributed equally. Project page: https://rsg.csail.mit.edu

We present a framework for learning useful subgoals that support efficient long-term planning to achieve novel goals. At the core of our framework is a collection of rational subgoals (RSGs), which are essentially binary classifiers over the environmental states. RSGs can be learned from weakly-annotated data, in the form of unsegmented demonstration trajectories, paired with abstract task descriptions, which are composed of terms initially unknown to the agent (e.g., collect-wood then craft-boat then go-across-river). Our framework also discovers dependencies between RSGs, e.g., the task collect-wood is a helpful subgoal for the task craft-boat. Given a goal description, the learned subgoals and the derived dependencies facilitate off-the-shelf planning algorithms, such as A* and RRT, by setting helpful subgoals as waypoints to the planner, which significantly improves performance-time efficiency.
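The waypoint idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the one-dimensional world, the `(position, has_wood, has_boat)` state encoding, and the hand-written predicates in `SUBGOALS` are hypothetical (in the paper the classifiers are learned from demonstrations), and plain breadth-first search stands in for an off-the-shelf planner such as A*.

```python
from collections import deque

# Hypothetical 1-D world: cells 0..6. Wood lies at cell 0, a workbench at
# cell 2, and the river at cell 4 can only be entered with a boat.
WOOD, BENCH, RIVER, LAST = 0, 2, 4, 6

def successors(state):
    """Yield next states; a state is (position, has_wood, has_boat)."""
    pos, wood, boat = state
    for step in (-1, 1):
        nxt = pos + step
        if not 0 <= nxt <= LAST:
            continue
        if nxt == RIVER and not boat:
            continue  # the river blocks agents without a boat
        yield (nxt, wood or nxt == WOOD, boat)  # wood is picked up on arrival
    if pos == BENCH and wood and not boat:
        yield (pos, False, True)  # crafting consumes the wood, yields a boat

# Each subgoal is a binary classifier over states. These are written by
# hand here purely for illustration.
SUBGOALS = {
    "collect-wood": lambda s: s[1],
    "craft-boat": lambda s: s[2],
    "go-across-river": lambda s: s[0] > RIVER,
}

def bfs(start, is_goal):
    """Return a shortest list of states from start to a goal state, or None."""
    frontier = deque([start])
    parent = {start: None}
    while frontier:
        state = frontier.popleft()
        if is_goal(state):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                frontier.append(nxt)
    return None

def plan_with_waypoints(start, subgoal_names):
    """Chain one search per subgoal, using each reached state as the next start."""
    path = [start]
    for name in subgoal_names:
        segment = bfs(path[-1], SUBGOALS[name])
        if segment is None:
            return None
        path += segment[1:]  # drop the repeated start state of the segment
    return path

plan = plan_with_waypoints(
    (3, False, False),
    ["collect-wood", "craft-boat", "go-across-river"],
)
for state in plan:
    print(state)
```

Each search only has to reach the next classifier's positive region, so the planner never needs a single search that reasons about the whole task at once. This toy corridor is too small to show a speedup; it only illustrates the mechanism.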

