Learning Modular Robot Visual-motor Locomotion Policies - 专知 Papers

April 29, 2023

Julian Whitman, Howie Choset

Control policy learning for modular robot locomotion has previously been limited to proprioceptive feedback and flat terrain. This paper develops policies for modular systems with vision traversing more challenging environments. These modular robots can be reconfigured to form many different designs, where each design needs a controller to function. Though one could create a policy for individual designs and environments, such an approach is not scalable given the wide range of potential designs and environments. To address this challenge, we create a visual-motor policy that can generalize to both new designs and environments. The policy itself is modular, in that it is divided into components, each of which corresponds to a type of module (e.g., a leg, wheel, or body). The policy components can be recombined during training to learn to control multiple designs. We develop a deep reinforcement learning algorithm where visual observations are input to a modular policy interacting with multiple environments at once. We apply this algorithm to train robots with combinations of legs and wheels, then demonstrate the policy controlling real robots climbing stairs and curbs.
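
The idea of a policy split into per-module-type components that are recombined across designs can be illustrated with a small sketch. This is not the authors' architecture or training algorithm: the module types come from the abstract, but the action sizes, observation sizes, the single linear layer per component, and all class and variable names (`ModuleComponent`, `ModularPolicy`, `ACTION_DIMS`, `OBS_DIM`, `VISION_DIM`) are assumptions made only for illustration. The sketch shows just one point: every module of a given type reuses the same parameters, so one set of components can drive several robot designs.

```python
import math
import random

# Hypothetical action sizes per module type (assumed, not from the paper).
ACTION_DIMS = {"body": 0, "leg": 3, "wheel": 1}
OBS_DIM = 4     # assumed size of each module's proprioceptive observation
VISION_DIM = 2  # assumed size of a shared visual feature vector


class ModuleComponent:
    """One policy component, shared by every module of the same type."""

    def __init__(self, action_dim, rng):
        in_dim = OBS_DIM + VISION_DIM
        # A single randomly initialised linear layer stands in for a learned network.
        self.weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
            for _ in range(action_dim)
        ]

    def act(self, obs, vision):
        # Concatenate local observation and visual features, then squash to [-1, 1].
        x = list(obs) + list(vision)
        return [
            math.tanh(sum(w * v for w, v in zip(row, x)))
            for row in self.weights
        ]


class ModularPolicy:
    """Holds one component per module type and applies it to each module."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.components = {
            module_type: ModuleComponent(dim, rng)
            for module_type, dim in ACTION_DIMS.items()
        }

    def act(self, design, observations, vision):
        # design: list of module type names; observations: one list per module.
        return [
            self.components[module_type].act(obs, vision)
            for module_type, obs in zip(design, observations)
        ]


# The same policy object controls two different (made-up) designs.
policy = ModularPolicy(seed=0)
designs = {
    "four_legs": ["body", "leg", "leg", "leg", "leg"],
    "legs_and_wheels": ["body", "leg", "leg", "wheel", "wheel"],
}
for name, design in designs.items():
    obs = [[0.1] * OBS_DIM for _ in design]
    actions = policy.act(design, obs, vision=[0.5, -0.2])
    print(name, [len(a) for a in actions])
```

Because the components are keyed by module type rather than by design, adding a new design only requires a new list of module types; no new parameters are created. A real system would replace the linear layers with trained networks and would also need some way for modules to exchange information, which this sketch omits.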
