BiPreManip: Learning Affordance-Based Bimanual Preparatory Manipulation through Anticipatory Collaboration - 专知论文

会员服务 ·

0

操作 · 可操作性 · 协作 · iPad · 泛化 ·

BiPreManip: Learning Affordance-Based Bimanual Preparatory Manipulation through Anticipatory Collaboration

翻译：BiPreManip: 基于可操作性的双臂预备操作学习：通过预期性协作

Yan Shen,Feng Jiang,Zichen He,Xiaoqi Li,Yuchen Liu,Zhiyu Li,Ruihai Wu,Hao Dong

from arxiv, Accepted to CVPR 2026

Many everyday objects are difficult to directly grasp (e.g., a flat iPad) or manipulate functionally (e.g., opening the cap of a pen lying on a desk). Such tasks require sequential, asymmetric coordination between two arms, where one arm performs preparatory manipulation that enables the other's goal-directed action - for instance, pushing the iPad to the table's edge before picking it up, or lifting the pen body to allow the other hand to remove its cap. In this work, we introduce Collaborative Preparatory Manipulation, a class of bimanual manipulation tasks that demand understanding object semantics and geometry, anticipating spatial relationships, and planning long-horizon coordinated actions between the two arms. To tackle this challenge, we propose a visual affordance-based framework that first envisions the final goal-directed action and then guides one arm to perform a sequence of preparatory manipulations that facilitate the other arm's subsequent operation. This affordance-centric representation enables anticipatory inter-arm reasoning and coordination, generalizing effectively across various objects spanning diverse categories. Extensive experiments in both simulation and the real world demonstrate that our approach substantially improves task success rates and generalization compared to competitive baselines.

翻译：许多日常物体难以直接抓取（例如平放的iPad）或进行功能性操作（例如打开桌上笔的笔帽）。这类任务要求双臂之间进行顺序性、非对称的协调，其中一只手臂执行预备性操作，为另一只手臂的目标导向动作创造条件——例如，将iPad推到桌边再拾起，或抬起笔身以便另一只手取下笔帽。本文提出"协作性预备操作"这一概念，这是一类需要理解物体语义与几何结构、预判空间关系、并规划双臂间长时程协调动作的双臂操作任务。为应对这一挑战，我们提出基于视觉可操作性的框架：首先构想最终目标导向动作，然后引导一只手臂执行一系列预备操作，以促进另一只手臂的后续操作。这种以可操作性为核心的表示方法实现了双臂间的预期性推理与协调，并能有效泛化至不同类别的多种物体。在仿真与真实环境中的大量实验表明，与竞争性基线方法相比，我们的方法显著提升了任务成功率与泛化能力。

0

相关内容

综述 | 机器人操作世界模型：预测、行动接口与学习生命周期

综述 | 机器人操作世界模型：预测、行动接口与学习生命周期

专知会员服务

11+阅读 · 6月3日

ICML 2025 | BiAssemble: 双臂机器人几何拼合问题的协同可供性学习

ICML 2025 | BiAssemble: 双臂机器人几何拼合问题的协同可供性学习

专知会员服务

11+阅读 · 2025年7月15日

GRAPH-BERT ：学习图表示只需要注意力，GRAPH-BERT : Only Attention is Needed for Learning Graph Representations

GRAPH-BERT ：学习图表示只需要注意力，GRAPH-BERT : Only Attention is Needed for Learning Graph Representations

专知会员服务

78+阅读 · 2020年5月31日

【斯坦福】探究预训练语言模型中的可迁移性，Investigating Transferability in PLM

【斯坦福】探究预训练语言模型中的可迁移性，Investigating Transferability in PLM

专知会员服务

20+阅读 · 2020年5月3日

【微软亚研】预训练文本表示作为元学习，Pre-training Text Representations

【微软亚研】预训练文本表示作为元学习，Pre-training Text Representations

专知会员服务

40+阅读 · 2020年4月17日

【CVPR2020-亚马逊】后向兼容表示学习，BackwardCompatible RepresentationLearning

【CVPR2020-亚马逊】后向兼容表示学习，BackwardCompatible RepresentationLearning

专知会员服务

13+阅读 · 2020年3月27日

【Google-MIT-哥伦比亚-ICRA2020】先看后学:操作前的视觉训练，Visual Pre-training

【Google-MIT-哥伦比亚-ICRA2020】先看后学:操作前的视觉训练，Visual Pre-training

专知会员服务

15+阅读 · 2020年3月21日

【谷歌大脑新论文】利用可微摄动优化器进行学习，Learning with Differentiable Perturbed Optimizers

【谷歌大脑新论文】利用可微摄动优化器进行学习，Learning with Differentiable Perturbed Optimizers

专知会员服务

29+阅读 · 2020年2月22日

【NTU Peng Xu】徒手素描的深度学习:综述论文，Deep Learning for Free-Hand Sketch

【NTU Peng Xu】徒手素描的深度学习:综述论文，Deep Learning for Free-Hand Sketch

专知会员服务

18+阅读 · 2020年1月9日

【斯坦福大学】具有共同注意力的对抗性跨域动作识别（Adversarial Cross-Domain Action Recognition with Co-Attention）

【斯坦福大学】具有共同注意力的对抗性跨域动作识别（Adversarial Cross-Domain Action Recognition with Co-Attention）

专知会员服务

38+阅读 · 2019年12月26日

《基于多智能体深度强化学习的空战模拟智能体协作》瑞典林雪平大学

《基于多智能体深度强化学习的空战模拟智能体协作》瑞典林雪平大学

专知

66+阅读 · 2022年8月25日

【CVPR2020-台大】透视眼：学会透过障碍物看东西，Learning to See Through Obstructions

【CVPR2020-台大】透视眼：学会透过障碍物看东西，Learning to See Through Obstructions

专知

26+阅读 · 2020年4月3日

【CVPR2020-牛津-谷歌】语音到动作:动作识别的跨模态监督，Cross-modal Supervision

【CVPR2020-牛津-谷歌】语音到动作:动作识别的跨模态监督，Cross-modal Supervision

专知

10+阅读 · 2020年3月31日

【Uber AI新论文】持续元学习，Learning to Continually Learn

【Uber AI新论文】持续元学习，Learning to Continually Learn

专知

19+阅读 · 2020年2月27日

【加州理工】什么是模仿学习(Imitation Learning（模仿学习), 这62页ppt带你了解进展，附下载

【加州理工】什么是模仿学习(Imitation Learning（模仿学习), 这62页ppt带你了解进展，附下载

专知

21+阅读 · 2019年11月14日

【伯克利PNAS最新论文】可解释机器学习的定义、方法和应用

【伯克利PNAS最新论文】可解释机器学习的定义、方法和应用

专知

78+阅读 · 2019年10月20日

BAM！利用知识蒸馏和多任务学习构建的通用语言模型

BAM！利用知识蒸馏和多任务学习构建的通用语言模型

机器之心

15+阅读 · 2019年3月18日

BERT-预训练的强大

BERT-预训练的强大

微信AI

61+阅读 · 2019年3月7日

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

AINLP

35+阅读 · 2018年11月6日

【干货】基于注意力机制的神经匹配模型用于短文本检索

【干货】基于注意力机制的神经匹配模型用于短文本检索

专知

11+阅读 · 2018年1月11日

针对大规模环境下复杂任务的策略搜索强化学习方法研究

国家自然科学基金

43+阅读 · 2015年12月31日

防肌肉疲劳双臂机器人人机协同基础研究

国家自然科学基金

1+阅读 · 2015年12月31日

儿童手写运动促进中英文感知的认知神经机制

国家自然科学基金

0+阅读 · 2015年12月31日

仿人轻型机械臂人机协作模式关键技术研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于多个小型微惯性/磁强计测量单元的手势识别研究

国家自然科学基金

0+阅读 · 2015年12月31日

机器灵巧手基于触滑觉信息协同的自适应力控制方法研究

国家自然科学基金

3+阅读 · 2015年12月31日

基于记忆学习与免疫系统的仿生控制研究

国家自然科学基金

7+阅读 · 2015年12月31日

局部可视环境中基于视觉和触觉感知的灵巧手精细操作的方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

基于深度信息和显著计算的手势交互技术研究及应用

国家自然科学基金

1+阅读 · 2014年12月31日

周期性手工装配作业肌肉疲劳预测建模与其装配质量改善

国家自然科学基金

0+阅读 · 2014年12月31日

VLBiMan: Vision-Language Anchored One-Shot Demonstration Enables Generalizable Bimanual Robotic Manipulation

Arxiv

0+阅读 · 5月1日

One-Shot Real-World Demonstration Synthesis for Scalable Bimanual Manipulation

Arxiv

0+阅读 · 4月27日

GenerativeMPC: VLM-RAG-guided Whole-Body MPC with Virtual Impedance for Bimanual Mobile Manipulation

Arxiv

0+阅读 · 4月21日

DART: Learning-Enhanced Model Predictive Control for Dual-Arm Non-Prehensile Manipulation

Arxiv

0+阅读 · 4月20日

BiCoord: A Bimanual Manipulation Benchmark towards Long-Horizon Spatial-Temporal Coordination

Arxiv

0+阅读 · 4月7日

Learning Geometry-Aware Nonprehensile Pushing and Pulling with Dexterous Hands

Arxiv

0+阅读 · 4月6日

Concurrent Prehensile and Nonprehensile Manipulation: A Practical Approach to Multi-Stage Dexterous Tasks

Arxiv

0+阅读 · 3月12日

EquiBim: Learning Symmetry-Equivariant Policy for Bimanual Manipulation

Arxiv

0+阅读 · 3月9日

Unified Learning of Temporal Task Structure and Action Timing for Bimanual Robot Manipulation

Arxiv

0+阅读 · 3月6日

TwinVLA: Data-Efficient Bimanual Manipulation with Twin Single-Arm Vision-Language-Action Models

Arxiv

0+阅读 · 2月23日

VIP会员

文章信息

相关主题

最新内容

印度精确打击与指挥架构的断层

印度精确打击与指挥架构的断层

专知会员服务

4+阅读 · 7月20日

《NASA喷气推进实验室：高耐久轻质常驻空观测系统（HELIOS）》429页

《NASA喷气推进实验室：高耐久轻质常驻空观测系统（HELIOS）》429页

专知会员服务

5+阅读 · 7月20日

美空军AI完成F-16战斗机自主空战历史性试飞

美空军AI完成F-16战斗机自主空战历史性试飞

专知会员服务

5+阅读 · 7月20日

《美政府问责局——武器系统年度评估（2026年）：强制要求成熟技术或可推动转向快速交付》249页

《美政府问责局——武器系统年度评估（2026年）：强制要求成熟技术或可推动转向快速交付》249页

专知会员服务

5+阅读 · 7月20日

《美国陆军：通过弹性分布式模型库实现自适应AI优势》

《美国陆军：通过弹性分布式模型库实现自适应AI优势》

专知会员服务

3+阅读 · 7月20日

博士论文 | 理解与改进大语言模型推理：从反转诅咒到连续思维链

博士论文 | 理解与改进大语言模型推理：从反转诅咒到连续思维链

专知会员服务

5+阅读 · 7月20日

综述 | 终身视觉表征：持续自监督学习CSSL系统综述

综述 | 终身视觉表征：持续自监督学习CSSL系统综述

专知会员服务

5+阅读 · 7月20日

深入Project Maven：为何人工智能在战场上依然失灵

深入Project Maven：为何人工智能在战场上依然失灵

专知会员服务

14+阅读 · 7月19日

锻造未来士兵：外骨骼、基因工程与赛博格

锻造未来士兵：外骨骼、基因工程与赛博格

专知会员服务

7+阅读 · 7月19日

《无人机系统（UAS）通信网状网络试验性部署》50页报告

《无人机系统（UAS）通信网状网络试验性部署》50页报告

专知会员服务

7+阅读 · 7月19日

《无人机蜂群通信技术研究》50页

《无人机蜂群通信技术研究》50页

专知会员服务

8+阅读 · 7月19日

《基于智能体建模与仿真的无人机蜂群模型目标定位涌现行为比较分析》360页

《基于智能体建模与仿真的无人机蜂群模型目标定位涌现行为比较分析》360页

专知会员服务

12+阅读 · 7月18日

欧洲智能弹药战略创新管理：迈向制导弹药、巡飞系统与自主无人机蜂群的技术主权研究路线图

欧洲智能弹药战略创新管理：迈向制导弹药、巡飞系统与自主无人机蜂群的技术主权研究路线图

专知会员服务

8+阅读 · 7月18日

从领域适配到部署与可解释：Berkeley博士论文解析大语言模型真实落地

从领域适配到部署与可解释：Berkeley博士论文解析大语言模型真实落地

专知会员服务

13+阅读 · 7月18日

综述 | 长程智能体研究全景：基础、演化、框架、优化与前沿

综述 | 长程智能体研究全景：基础、演化、框架、优化与前沿

专知会员服务

10+阅读 · 7月18日

相关VIP内容

综述 | 机器人操作世界模型：预测、行动接口与学习生命周期

综述 | 机器人操作世界模型：预测、行动接口与学习生命周期

专知会员服务

11+阅读 · 6月3日

ICML 2025 | BiAssemble: 双臂机器人几何拼合问题的协同可供性学习

ICML 2025 | BiAssemble: 双臂机器人几何拼合问题的协同可供性学习

专知会员服务

11+阅读 · 2025年7月15日

GRAPH-BERT ：学习图表示只需要注意力，GRAPH-BERT : Only Attention is Needed for Learning Graph Representations

GRAPH-BERT ：学习图表示只需要注意力，GRAPH-BERT : Only Attention is Needed for Learning Graph Representations

专知会员服务

78+阅读 · 2020年5月31日

【斯坦福】探究预训练语言模型中的可迁移性，Investigating Transferability in PLM

【斯坦福】探究预训练语言模型中的可迁移性，Investigating Transferability in PLM

专知会员服务

20+阅读 · 2020年5月3日

【微软亚研】预训练文本表示作为元学习，Pre-training Text Representations

【微软亚研】预训练文本表示作为元学习，Pre-training Text Representations

专知会员服务

40+阅读 · 2020年4月17日

【CVPR2020-亚马逊】后向兼容表示学习，BackwardCompatible RepresentationLearning

【CVPR2020-亚马逊】后向兼容表示学习，BackwardCompatible RepresentationLearning

专知会员服务

13+阅读 · 2020年3月27日

【Google-MIT-哥伦比亚-ICRA2020】先看后学:操作前的视觉训练，Visual Pre-training

【Google-MIT-哥伦比亚-ICRA2020】先看后学:操作前的视觉训练，Visual Pre-training

专知会员服务

15+阅读 · 2020年3月21日

【谷歌大脑新论文】利用可微摄动优化器进行学习，Learning with Differentiable Perturbed Optimizers

【谷歌大脑新论文】利用可微摄动优化器进行学习，Learning with Differentiable Perturbed Optimizers

专知会员服务

29+阅读 · 2020年2月22日

【NTU Peng Xu】徒手素描的深度学习:综述论文，Deep Learning for Free-Hand Sketch

【NTU Peng Xu】徒手素描的深度学习:综述论文，Deep Learning for Free-Hand Sketch

专知会员服务

18+阅读 · 2020年1月9日

【斯坦福大学】具有共同注意力的对抗性跨域动作识别（Adversarial Cross-Domain Action Recognition with Co-Attention）

【斯坦福大学】具有共同注意力的对抗性跨域动作识别（Adversarial Cross-Domain Action Recognition with Co-Attention）

专知会员服务

38+阅读 · 2019年12月26日

热门VIP内容

开通专知VIP会员享更多权益服务

《NASA喷气推进实验室：高耐久轻质常驻空观测系统（HELIOS）》429页

《美政府问责局——武器系统年度评估（2026年）：强制要求成熟技术或可推动转向快速交付》249页

印度精确打击与指挥架构的断层

美空军AI完成F-16战斗机自主空战历史性试飞

相关资讯

《基于多智能体深度强化学习的空战模拟智能体协作》瑞典林雪平大学

《基于多智能体深度强化学习的空战模拟智能体协作》瑞典林雪平大学

专知

66+阅读 · 2022年8月25日

【CVPR2020-台大】透视眼：学会透过障碍物看东西，Learning to See Through Obstructions

【CVPR2020-台大】透视眼：学会透过障碍物看东西，Learning to See Through Obstructions

专知

26+阅读 · 2020年4月3日

【CVPR2020-牛津-谷歌】语音到动作:动作识别的跨模态监督，Cross-modal Supervision

【CVPR2020-牛津-谷歌】语音到动作:动作识别的跨模态监督，Cross-modal Supervision

专知

10+阅读 · 2020年3月31日

【Uber AI新论文】持续元学习，Learning to Continually Learn

【Uber AI新论文】持续元学习，Learning to Continually Learn

专知

19+阅读 · 2020年2月27日

【加州理工】什么是模仿学习(Imitation Learning（模仿学习), 这62页ppt带你了解进展，附下载

【加州理工】什么是模仿学习(Imitation Learning（模仿学习), 这62页ppt带你了解进展，附下载

专知

21+阅读 · 2019年11月14日

【伯克利PNAS最新论文】可解释机器学习的定义、方法和应用

【伯克利PNAS最新论文】可解释机器学习的定义、方法和应用

专知

78+阅读 · 2019年10月20日

BAM！利用知识蒸馏和多任务学习构建的通用语言模型

BAM！利用知识蒸馏和多任务学习构建的通用语言模型

机器之心

15+阅读 · 2019年3月18日

BERT-预训练的强大

BERT-预训练的强大

微信AI

61+阅读 · 2019年3月7日

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

pytorch-pretrained-BERT：BERT PyTorch实现，可加载Google BERT预训练模型

AINLP

35+阅读 · 2018年11月6日

【干货】基于注意力机制的神经匹配模型用于短文本检索

【干货】基于注意力机制的神经匹配模型用于短文本检索

专知

11+阅读 · 2018年1月11日

相关论文

VLBiMan: Vision-Language Anchored One-Shot Demonstration Enables Generalizable Bimanual Robotic Manipulation

Arxiv

0+阅读 · 5月1日

One-Shot Real-World Demonstration Synthesis for Scalable Bimanual Manipulation

Arxiv

0+阅读 · 4月27日

GenerativeMPC: VLM-RAG-guided Whole-Body MPC with Virtual Impedance for Bimanual Mobile Manipulation

Arxiv

0+阅读 · 4月21日

DART: Learning-Enhanced Model Predictive Control for Dual-Arm Non-Prehensile Manipulation

Arxiv

0+阅读 · 4月20日

BiCoord: A Bimanual Manipulation Benchmark towards Long-Horizon Spatial-Temporal Coordination

Arxiv

0+阅读 · 4月7日

Learning Geometry-Aware Nonprehensile Pushing and Pulling with Dexterous Hands

Arxiv

0+阅读 · 4月6日

Concurrent Prehensile and Nonprehensile Manipulation: A Practical Approach to Multi-Stage Dexterous Tasks

Arxiv

0+阅读 · 3月12日

EquiBim: Learning Symmetry-Equivariant Policy for Bimanual Manipulation

Arxiv

0+阅读 · 3月9日

Unified Learning of Temporal Task Structure and Action Timing for Bimanual Robot Manipulation

Arxiv

0+阅读 · 3月6日

TwinVLA: Data-Efficient Bimanual Manipulation with Twin Single-Arm Vision-Language-Action Models

Arxiv

0+阅读 · 2月23日

相关基金

针对大规模环境下复杂任务的策略搜索强化学习方法研究

国家自然科学基金

43+阅读 · 2015年12月31日

防肌肉疲劳双臂机器人人机协同基础研究

国家自然科学基金

1+阅读 · 2015年12月31日

儿童手写运动促进中英文感知的认知神经机制

国家自然科学基金

0+阅读 · 2015年12月31日

仿人轻型机械臂人机协作模式关键技术研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于多个小型微惯性/磁强计测量单元的手势识别研究

国家自然科学基金

0+阅读 · 2015年12月31日

机器灵巧手基于触滑觉信息协同的自适应力控制方法研究

国家自然科学基金

3+阅读 · 2015年12月31日

基于记忆学习与免疫系统的仿生控制研究

国家自然科学基金

7+阅读 · 2015年12月31日

局部可视环境中基于视觉和触觉感知的灵巧手精细操作的方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

基于深度信息和显著计算的手势交互技术研究及应用

国家自然科学基金

1+阅读 · 2014年12月31日

周期性手工装配作业肌肉疲劳预测建模与其装配质量改善

国家自然科学基金

0+阅读 · 2014年12月31日

微信扫码咨询专知VIP会员