For a robot to be called socially intelligent, it must be able to infer users' internal states from their current behaviour, predict users' future behaviour, and, if required, respond appropriately. In this work, we investigate how robots can be endowed with such social intelligence by modelling the dynamic relationship between users' internal states (latent) and actions (observable state). Our premise is that these states arise from the same underlying socio-cognitive process and influence each other dynamically. Drawing inspiration from theories in Cognitive Science, we propose a novel multi-task learning framework, termed \textbf{SocialLDG}, that explicitly models the dynamic relationship among these states, represented as six distinct tasks. Our framework uses a language model to introduce lexical priors for each task and employs dynamic graph learning to model how task affinity evolves over time. SocialLDG has three advantages. First, it achieves state-of-the-art performance on two challenging, publicly available human-robot social interaction datasets. Second, it supports strong task scalability, learning new tasks seamlessly without catastrophic forgetting. Finally, by explicitly modelling task affinity, it offers insights into how different interactions unfold over time and how internal states and observable actions influence each other in human decision-making.