May 4, 2023

Program Synthesis for Robot Learning from Demonstrations


Noah Patton, Kia Rahmani, Meghana Missula, Joydeep Biswas, Işil Dillig

From arXiv, 31 pages, submitted for review

This paper presents a new synthesis-based approach for solving the Learning from Demonstration (LfD) problem in robotics. Given a set of user demonstrations, the goal of programmatic LfD is to learn a policy in a programming language that can be used to control a robot's behavior. We address this problem through a novel program synthesis algorithm that leverages two key ideas: First, to perform fast and effective generalization from user demonstrations, our synthesis algorithm views these demonstrations as strings over a finite alphabet and abstracts programs in our DSL as regular expressions over the same alphabet. This regex abstraction facilitates synthesis by helping infer useful program sketches and pruning infeasible parts of the search space. Second, to deal with the large number of object types in the environment, our method leverages a Large Language Model (LLM) to guide search. We have implemented our approach in a tool called Prolex and present the results of a comprehensive experimental evaluation on 120 benchmarks involving 40 unique tasks in three different environments. We show that, given a 120 second time limit, Prolex can find a program consistent with the demonstrations in 80% of the cases. Furthermore, for 81% of the tasks for which a solution is returned, Prolex is able to find the ground truth program with just one demonstration. To put these results in perspective, we conduct a comparison against two baselines and show that both perform much worse.
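The regex abstraction described in the abstract can be illustrated with a small sketch. This is not Prolex's implementation: the action names, the one-character alphabet, and the two candidate abstractions below are hypothetical, chosen only to show how a demonstration can be read as a string over a finite alphabet and how a candidate program's regex abstraction can be checked against it.

```python
import re

# Hypothetical alphabet: one character per robot action type.
# These names are assumptions for illustration, not the paper's DSL.
ALPHABET = {"goto": "g", "grab": "p", "put": "d"}


def encode(demo):
    """Map a demonstration (a list of action names) to a string over the alphabet."""
    return "".join(ALPHABET[action] for action in demo)


def consistent(abstraction, demos):
    """Return True if every encoded demonstration fully matches the regex abstraction."""
    return all(re.fullmatch(abstraction, encode(d)) is not None for d in demos)


# Two hypothetical demonstrations of the same task with different numbers of objects.
demos = [
    ["goto", "grab", "goto", "put"],
    ["goto", "grab", "goto", "put", "goto", "grab", "goto", "put"],
]

# Assumed regex abstractions of two candidate program shapes.
loop_sketch = "(gpgd)*"  # a loop whose body is goto; grab; goto; put
straight_line = "gpgd"   # the same four actions with no loop

print(consistent(loop_sketch, demos))    # True: both demonstrations match
print(consistent(straight_line, demos))  # False: the longer demonstration does not match
```

In this toy setting the loop-free shape is rejected by a cheap string match, without enumerating any concrete programs of that shape, which is the kind of pruning the abstract attributes to the regex abstraction.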

