A robot providing mealtime assistance must perform specialized maneuvers with various utensils to pick up and feed a range of food items. Beyond these dexterous low-level skills, an assistive robot must also plan these skills in sequence over a long horizon to clear a plate and complete a meal. Previous methods in robot-assisted feeding introduce highly specialized primitives for food handling without a means to compose them. Meanwhile, existing approaches to long-horizon manipulation lack the flexibility to embed such specialized primitives into their frameworks. We propose Visual Action Planning OveR Sequences (VAPORS), a framework for long-horizon food acquisition. VAPORS learns a policy for high-level action selection by leveraging learned latent plate dynamics in simulation. To carry out sequential plans in the real world, VAPORS delegates action execution to visually parameterized primitives. We validate our approach in complex real-world trials involving noodle acquisition and bimanual scooping of jelly beans. Across 38 plates, VAPORS acquires food much more efficiently than baselines, generalizes across realistic plate variations such as toppings and sauces, and qualitatively appeals to user feeding preferences in a survey of 49 individuals. Code, datasets, videos, and supplementary materials can be found on our website: https://sites.google.com/view/vaporsbot.