Large language models (LLMs) can provide rich physical descriptions of most real-world objects, allowing robots to achieve more informed and capable grasping. We leverage LLMs' common-sense physical reasoning and code-writing abilities to infer an object's physical characteristics (mass $m$, friction coefficient $\mu$, and spring constant $k$) from a semantic description, and then translate those characteristics into an executable adaptive grasp policy. Using a current-controllable, two-finger gripper with a built-in depth camera, we demonstrate that LLM-generated, physically grounded grasp policies outperform traditional grasp policies on a custom benchmark of 12 delicate and deformable items, including food, produce, toys, and other everyday objects, spanning two orders of magnitude in mass and required pick-up force. We also demonstrate how compliance feedback from DeliGrasp policies can aid in downstream tasks such as measuring produce ripeness. Our code and videos are available at: https://deligrasp.github.io
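To make the step from estimated properties to a grasp policy concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a simple Coulomb friction model in which the target grip force is $F_{\min} = mg/\mu$, and a gripper that closes in small position steps until the measured force reaches that target. The names `ObjectEstimate`, `min_grip_force`, `adaptive_grasp`, and `read_force`, as well as the step size and the numeric values in the usage example, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

G = 9.81  # gravitational acceleration in m/s^2


@dataclass
class ObjectEstimate:
    """Hypothetical container for LLM-estimated physical properties."""
    mass_kg: float               # m
    friction_coeff: float        # mu
    spring_const_n_per_m: float  # k


def min_grip_force(obj: ObjectEstimate) -> float:
    """Target normal force under an assumed Coulomb model: mu * F >= m * g."""
    return obj.mass_kg * G / obj.friction_coeff


def adaptive_grasp(
    obj: ObjectEstimate,
    read_force: Callable[[float], float],
    step_m: float = 0.001,
    max_steps: int = 100,
) -> Tuple[float, float, bool]:
    """Close in small steps until the measured force reaches the target.

    read_force maps gripper closure (m) to measured contact force (N);
    on hardware this would be a sensor reading, here it is a stand-in.
    Returns (closure_m, force_n, target_reached).
    """
    target = min_grip_force(obj)
    closure = 0.0
    force = read_force(closure)
    for _ in range(max_steps):
        if force >= target:
            return closure, force, True
        closure += step_m
        force = read_force(closure)
    return closure, force, force >= target


# Usage with illustrative (not measured) values and a linear-spring
# stand-in for the object: contact force = k * closure.
estimate = ObjectEstimate(mass_kg=0.2, friction_coeff=0.5,
                          spring_const_n_per_m=500.0)
closure, force, reached = adaptive_grasp(
    estimate, lambda x: estimate.spring_const_n_per_m * x
)
```

In this sketch the spring constant $k$ appears only in the simulated object; in a real controller it could instead inform the step size, since a stiffer object gains more force per unit of closure.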