The emergence of large language models (LLMs) and, in turn, vision language models (VLMs) has sparked new ideas among robotics researchers. The range of applications for LLMs and VLMs in human-robot interaction (HRI), particularly in socially assistive robots (SARs), remains uncharted territory. These models present unprecedented opportunities as well as challenges for SAR integration, and we aim to illuminate both for roboticists who deploy LLMs and VLMs in SARs. First, we conducted a meta-study of more than 250 papers exploring 1) major robots in HRI research and 2) significant applications of SARs, emphasizing education, healthcare, and entertainment, while considering 3) societal norms and issues such as trust, bias, and ethics that robot developers must address. Then, we identified 4) critical components of a robot that an LLM or VLM can replace, along with 5) the benefits of integrating LLMs into robot designs and 6) the risks involved. Finally, we outline a pathway for the responsible and effective adoption of LLMs or VLMs in SARs, and we close by urging caution regarding this deployment.