For a robot to be called socially intelligent, it must be able to infer users' internal states from their current behaviour, predict their future behaviour, and, if required, respond appropriately. In this work, we investigate how robots can be endowed with such social intelligence by modelling the dynamic relationship between a user's internal states (latent) and actions (observable states). Our premise is that these states arise from the same underlying socio-cognitive process and influence each other dynamically. Drawing inspiration from theories in Cognitive Science, we propose a novel multi-task learning framework, termed \textbf{SocialLDG}, that explicitly models the dynamic relationship among these states, represented as six distinct tasks. Our framework uses a language model to introduce lexical priors for each task and employs dynamic graph learning to model how task affinity evolves over time. SocialLDG has three advantages. First, it achieves state-of-the-art performance on two challenging, publicly available human-robot social interaction datasets. Second, it supports strong task scalability, learning new tasks seamlessly without catastrophic forgetting. Finally, by explicitly modelling task affinity, it offers insights into how different interactions unfold over time and how internal states and observable actions influence each other in human decision making.