Enabling robots to dexterously grasp and manipulate objects based on human commands is a promising direction in robotics. However, existing approaches struggle to generalize across diverse objects and tasks due to the limited scale of semantic dexterous grasp datasets. Foundation models offer a new way to enhance generalization, yet directly leveraging them to generate feasible robotic actions remains challenging due to the gap between abstract model knowledge and physical robot execution. To address these challenges, we propose OmniDexGrasp, a generalizable framework that achieves omni-capabilities in user prompting, dexterous embodiment, and grasping tasks by combining foundation models with transfer and control strategies. OmniDexGrasp integrates three key modules: (i) foundation models generate human grasp images, enhancing generalization and supporting omni-capability across user prompts and tasks; (ii) a human-image-to-robot-action transfer strategy converts human demonstrations into executable robot actions, enabling omni dexterous embodiment; (iii) a force-aware adaptive grasp strategy ensures robust and stable grasp execution. Experiments in simulation and on real robots validate the effectiveness of OmniDexGrasp across diverse user prompts, grasp tasks, and dexterous hands, and further results demonstrate its extensibility to dexterous manipulation tasks.