A robot providing mealtime assistance must perform specialized maneuvers with various utensils to pick up and feed a range of food items. Beyond these dexterous low-level skills, an assistive robot must also sequence them over a long horizon to clear a plate and complete a meal. Previous methods in robot-assisted feeding introduce highly specialized primitives for food handling but provide no means of composing them. Meanwhile, existing approaches to long-horizon manipulation lack the flexibility to embed such specialized primitives into their frameworks. We propose Visual Action Planning OveR Sequences (VAPORS), a framework for long-horizon food acquisition. VAPORS learns a policy for high-level action selection by leveraging learned latent plate dynamics in simulation. To carry out sequential plans in the real world, VAPORS delegates action execution to visually parameterized primitives. We validate our approach in complex real-world trials involving noodle acquisition and bimanual scooping of jelly beans. Across 38 plates, VAPORS acquires food much more efficiently than baselines, generalizes across realistic plate variations such as toppings and sauces, and qualitatively appeals to user feeding preferences in a survey of 49 individuals. Code, datasets, videos, and supplementary materials can be found on our website: https://sites.google.com/view/vaporsbot.