Large language models (LLMs) can provide rich physical descriptions of most real-world objects, allowing robots to grasp in a more informed and capable way. We leverage LLMs' commonsense physical reasoning and code-writing abilities to infer an object's physical characteristics (mass $m$, friction coefficient $\mu$, and spring constant $k$) from a semantic description, and then translate those characteristics into an executable adaptive grasp policy. Using a current-controllable, two-finger gripper with a built-in depth camera, we demonstrate that LLM-generated, physically grounded grasp policies outperform traditional grasp policies on a custom benchmark of 12 delicate and deformable items, including food, produce, toys, and other everyday objects, that span two orders of magnitude in mass and required pick-up force. We also demonstrate how compliance feedback from DeliGrasp policies can aid downstream tasks such as measuring produce ripeness. Our code and videos are available at: https://deligrasp.github.io
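To make the step from physical estimates to a grasp policy concrete, the following is a minimal sketch and not the authors' implementation. It assumes a simple two-finger friction model, $2\mu F \ge mg$, to set a target grip force, and uses the spring constant $k$ to convert a force increment into a closing distance. The names `ObjectEstimate`, `min_grasp_force`, and `adaptive_grasp` are hypothetical, the numeric values are illustrative rather than taken from the paper, and the object is simulated as an ideal linear spring instead of being read from a real gripper.

```python
from dataclasses import dataclass

G = 9.81  # gravitational acceleration, m/s^2


@dataclass
class ObjectEstimate:
    """Physical estimates, e.g. as an LLM might infer them from a description."""
    mass_kg: float
    friction_coeff: float
    spring_const_n_per_m: float


def min_grasp_force(est: ObjectEstimate) -> float:
    """Smallest per-finger force that holds the object without slip.

    Assumed model (not from the source): two opposing fingers, so the
    total friction 2 * mu * F must be at least the weight m * g.
    """
    return est.mass_kg * G / (2.0 * est.friction_coeff)


def adaptive_grasp(est: ObjectEstimate, force_step_n: float = 0.1,
                   max_steps: int = 200) -> tuple[float, float]:
    """Close in small steps until the target force is reached.

    Each step adds roughly `force_step_n` of force; the matching closing
    distance is force_step_n / k. The contact force is simulated here as
    an ideal spring (F = k * x) in place of a real sensor reading.
    """
    target_n = min_grasp_force(est)
    step_m = force_step_n / est.spring_const_n_per_m
    squeeze_m = 0.0
    force_n = 0.0
    for _ in range(max_steps):
        if force_n >= target_n:
            break
        squeeze_m += step_m
        force_n = est.spring_const_n_per_m * squeeze_m  # simulated feedback
    return force_n, squeeze_m


if __name__ == "__main__":
    # Illustrative values only: a 200 g soft object.
    est = ObjectEstimate(mass_kg=0.2, friction_coeff=0.5,
                         spring_const_n_per_m=500.0)
    force_n, squeeze_m = adaptive_grasp(est)
    print(f"target {min_grasp_force(est):.3f} N, "
          f"applied {force_n:.3f} N at {squeeze_m * 1000:.2f} mm squeeze")
```

On hardware, the simulated spring line would be replaced by a measured force or motor-current reading. A stiffer object (larger $k$) yields a smaller closing step, which is how this sketch keeps force increments bounded on delicate items.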