The integration of conversational agents into daily life has become increasingly common, yet many such agents cannot engage in deep interactions with humans. Moreover, there is a noticeable shortage of datasets that capture multimodal information from human-robot interaction dialogues. To address this gap, we developed a Personal Emotional Robotic Conversational sYstem (PERCY) and recorded a novel multimodal dataset encompassing rich embodied interaction data. Participants first completed a questionnaire, from which we gathered profiles on ten topics, such as hobbies and favourite music. We then initiated conversations between the robot and the participants, leveraging GPT-4 to generate contextually appropriate responses based on each participant's profile and emotional state, as determined by facial expression recognition and sentiment analysis. Automatic and user evaluations were conducted to assess the overall quality of the collected data. Both evaluations indicated a high level of naturalness, engagement, fluency, consistency, and relevance in the conversations, as well as the robot's ability to provide empathetic responses. Notably, the dataset is derived from genuine interactions with the robot, involving participants who provided personal information and conveyed actual emotions.
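To make the conditioning step concrete, the sketch below shows one plausible way to assemble a GPT-4 chat prompt from a participant's topic profile and a detected emotion label. This is a minimal illustration under assumed names and prompt wording; the function `build_prompt`, the profile keys, and the message phrasing are hypothetical, not taken from PERCY itself.

```python
# Hypothetical sketch of profile- and emotion-conditioned prompting.
# Neither the function name nor the prompt wording comes from the paper.

def build_prompt(profile, emotion, user_utterance):
    """Compose a chat-style prompt conditioned on the participant's
    profile (topic -> preference) and their detected emotional state."""
    profile_lines = "\n".join(
        f"- {topic}: {value}" for topic, value in profile.items()
    )
    system = (
        "You are a friendly conversational robot. "
        "Respond empathetically and stay consistent with the user's profile.\n"
        f"User profile:\n{profile_lines}\n"
        "Detected emotion (from facial expression recognition "
        f"and sentiment analysis): {emotion}"
    )
    # The message list would be passed to a chat-completion API such as GPT-4.
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_utterance},
    ]

messages = build_prompt(
    {"hobbies": "hiking", "favourite music": "jazz"},  # two of ten topics
    "happy",
    "I finally went hiking again this weekend!",
)
```

In a pipeline like the one described, the emotion label would be refreshed each turn from the vision and sentiment modules, so the system message stays aligned with the participant's current state.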