STAP: Sequencing Task-Agnostic Policies

May 31, 2023

Christopher Agia, Toki Migimatsu, Jiajun Wu, Jeannette Bohg

From arXiv. Video: https://drive.google.com/file/d/1zp3qFeZLACNPsGLLP7p6q9X1tuA_PGEo/view. Project page: https://sites.google.com/stanford.edu/stap. 12 pages, 7 figures. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2023. The first two authors contributed equally.

Advances in robotic skill acquisition have made it possible to build general-purpose libraries of learned skills for downstream manipulation tasks. However, naively executing these skills one after the other is unlikely to succeed without accounting for dependencies between actions prevalent in long-horizon plans. We present Sequencing Task-Agnostic Policies (STAP), a scalable framework for training manipulation skills and coordinating their geometric dependencies at planning time to solve long-horizon tasks never seen by any skill during training. Given that Q-functions encode a measure of skill feasibility, we formulate an optimization problem to maximize the joint success of all skills sequenced in a plan, which we estimate by the product of their Q-values. Our experiments indicate that this objective function approximates ground truth plan feasibility and, when used as a planning objective, reduces myopic behavior and thereby promotes long-horizon task success. We further demonstrate how STAP can be used for task and motion planning by estimating the geometric feasibility of skill sequences provided by a task planner. We evaluate our approach in simulation and on a real robot. Qualitative results and code are made available at https://sites.google.com/stanford.edu/stap.
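The planning objective described in the abstract, the product of per-skill Q-values along a plan, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the authors' code: the 1-D states and actions, the toy Q-functions `q_place` and `q_push`, the toy dynamics `f_place` and `f_push`, and the random-shooting optimizer, which stands in for whatever optimizer the paper actually uses.

```python
import random
from typing import Callable, List, Sequence, Tuple

State = float   # toy 1-D state (hypothetical)
Action = float  # toy 1-D skill parameter (hypothetical)
QFunction = Callable[[State, Action], float]  # success estimate in [0, 1]
Dynamics = Callable[[State, Action], State]   # predicted next state


def plan_score(state: State, actions: Sequence[Action],
               q_functions: Sequence[QFunction],
               dynamics: Sequence[Dynamics]) -> float:
    """Estimate joint plan success as the product of per-skill Q-values."""
    score = 1.0
    for action, q, f in zip(actions, q_functions, dynamics):
        score *= q(state, action)   # feasibility of this skill in this state
        state = f(state, action)    # roll the predicted state forward
    return score


def random_shooting(state: State, q_functions: Sequence[QFunction],
                    dynamics: Sequence[Dynamics], num_samples: int = 2000,
                    seed: int = 0) -> Tuple[List[Action], float]:
    """Sample action sequences uniformly and keep the best-scoring one."""
    rng = random.Random(seed)
    best_actions: List[Action] = []
    best_score = -1.0
    for _ in range(num_samples):
        actions = [rng.uniform(0.0, 1.0) for _ in q_functions]
        score = plan_score(state, actions, q_functions, dynamics)
        if score > best_score:
            best_actions, best_score = actions, score
    return best_actions, best_score


# Toy skills (hypothetical): "place" slightly prefers a large parameter,
# but a large placement makes the following "push" infeasible.
def q_place(s: State, a: Action) -> float:
    return 0.9 + 0.1 * a

def f_place(s: State, a: Action) -> State:
    return a  # the next state is where the object was placed

def q_push(s: State, a: Action) -> float:
    return max(0.0, 1.0 - s)

def f_push(s: State, a: Action) -> State:
    return s


SKILL_QS = [q_place, q_push]
SKILL_DYNAMICS = [f_place, f_push]

if __name__ == "__main__":
    # Greedy (myopic) choice: maximize q_place alone, so a = 1.0.
    print("greedy plan score:", plan_score(0.0, [1.0, 0.5], SKILL_QS, SKILL_DYNAMICS))
    print("planned:", random_shooting(0.0, SKILL_QS, SKILL_DYNAMICS))
```

In this toy setting the greedy choice for the first skill drives the second skill's Q-value, and therefore the product, to zero, whereas optimizing the product over the whole sequence selects a placement that keeps the later skill feasible. This is the kind of myopic behavior the abstract says the objective is meant to reduce.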

