For a robot to personalize physical assistance effectively, it must learn user preferences that can be generally reapplied to future scenarios. In this work, we investigate personalization of household cleanup with robots that can tidy up rooms by picking up objects and putting them away. A key challenge is determining the proper place to put each object, as people's preferences can vary greatly depending on personal taste or cultural background. For instance, one person may prefer storing shirts in the drawer, while another may prefer them on the shelf. We aim to build systems that can learn such preferences from just a handful of examples via prior interactions with a particular person. We show that robots can combine language-based planning and perception with the few-shot summarization capabilities of large language models (LLMs) to infer generalized user preferences that are broadly applicable to future interactions. This approach enables fast adaptation and achieves 91.2% accuracy on unseen objects in our benchmark dataset. We also demonstrate our approach on a real-world mobile manipulator called TidyBot, which successfully puts away 85.0% of objects in real-world test scenarios.
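As a rough illustration of the idea of summarizing a handful of observed placements into general rules and then applying them to unseen objects, here is a minimal Python sketch. It is not the authors' implementation: the function names, the prompt wording, the example objects, and the category mapping are all hypothetical, and the LLM's summary is replaced by a hand-written stand-in so the snippet runs without any model.

```python
from typing import Dict, List, Tuple

# Hypothetical few-shot examples: (object, receptacle) pairs observed for one user.
Example = Tuple[str, str]


def build_summarization_prompt(examples: List[Example]) -> str:
    """Format observed placements as text and ask for general rules."""
    lines = [f"{obj} -> {place}" for obj, place in examples]
    return (
        "Observed placements:\n"
        + "\n".join(lines)
        + "\nSummarize the user's preferences as general rules:"
    )


def apply_rules(
    rules: Dict[str, str],
    category_of: Dict[str, str],
    obj: str,
    default: str,
) -> str:
    """Place an object using category-level rules; fall back to a default."""
    return rules.get(category_of.get(obj, ""), default)


examples = [
    ("yellow shirt", "drawer"),
    ("white socks", "drawer"),
    ("dark jacket", "closet"),
]
prompt = build_summarization_prompt(examples)

# Stand-in for the LLM's summary (assumption); a real system would obtain
# this from a model given the prompt above.
rules = {"light clothes": "drawer", "dark clothes": "closet"}

# Assumed mapping from unseen objects to the summarized categories.
category_of = {"white shirt": "light clothes", "black hoodie": "dark clothes"}

placement = apply_rules(rules, category_of, "white shirt", default="shelf")
```

The point of the sketch is that the rules are stated over categories rather than individual objects, which is what lets a few examples carry over to objects the user never demonstrated.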