For a robot to personalize physical assistance effectively, it must learn user preferences that can be generally reapplied to future scenarios. In this work, we investigate personalization of household cleanup with robots that can tidy up rooms by picking up objects and putting them away. A key challenge is determining the proper place to put each object, as people's preferences can vary greatly depending on personal taste or cultural background. For instance, one person may prefer storing shirts in the drawer, while another may prefer them on the shelf. We aim to build systems that can learn such preferences from just a handful of examples via prior interactions with a particular person. We show that robots can combine language-based planning and perception with the few-shot summarization capabilities of large language models (LLMs) to infer generalized user preferences that are broadly applicable to future interactions. This approach enables fast adaptation and achieves 91.2% accuracy on unseen objects in our benchmark dataset. We also demonstrate our approach on a real-world mobile manipulator called TidyBot, which successfully puts away 85.0% of objects in real-world test scenarios.
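As a rough illustration of the few-shot summarization idea described above, the following minimal sketch formats a handful of observed placements into a text prompt that could be passed to an LLM for summarization into general rules. The function name `build_summarization_prompt`, the prompt wording, and the example placements are all hypothetical and chosen for illustration; they are not taken from the TidyBot system, and no LLM is actually called here.

```python
from collections import defaultdict


def build_summarization_prompt(placements):
    """Format observed (object, receptacle) pairs as a summarization prompt.

    Hypothetical helper: the prompt layout is an assumption, not the
    wording used by the authors.
    """
    # Group objects by the receptacle the user placed them in.
    by_receptacle = defaultdict(list)
    for obj, receptacle in placements:
        by_receptacle[receptacle].append(obj)

    lines = ["Observed placements for this user:"]
    for receptacle, objs in by_receptacle.items():
        lines.append(f"- {', '.join(objs)} -> {receptacle}")
    lines.append(
        "Summarize these placements as general rules, one per receptacle."
    )
    return "\n".join(lines)


# Made-up examples of one user's prior placements.
examples = [
    ("yellow shirt", "drawer"),
    ("white socks", "drawer"),
    ("dark purple shirt", "shelf"),
]
prompt = build_summarization_prompt(examples)
print(prompt)
```

The text returned by a model for such a prompt could then be reused to decide where an unseen object belongs, which is the generalization step the abstract refers to.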