Friendship and rapport play an important role in the formation of constructive social interactions, and have been widely studied in educational settings due to their impact on student outcomes. Given the growing interest in automating the analysis of such phenomena through Machine Learning (ML), access to annotated interaction datasets is highly valuable. However, no dataset of dyadic child-child interactions explicitly capturing rapport currently exists. Moreover, despite advances in the automatic analysis of human behaviour, no previous work has addressed the prediction of rapport in child-child dyadic interactions in educational settings. We present UpStory -- the Uppsala Storytelling dataset: a novel dataset of naturalistic dyadic interactions between primary-school-aged children, with an experimental manipulation of rapport. Pairs of children aged 8-10 participate in a task-oriented activity -- designing a story together -- while being allowed free movement within the play area. We promote balanced collection of different levels of rapport by using a within-subjects design: self-reported friendships are used to pair each child twice, either minimizing or maximizing pair separation in the friendship network. The dataset contains data for 35 pairs, totalling 3h 40m of audio and video recordings. It includes two video sources covering the play area, as well as separate voice recordings for each child. An anonymized version of the dataset is made publicly available, containing per-frame head pose, body pose, and face features, as well as per-pair information, including the level of rapport. Finally, we provide ML baselines for the prediction of rapport.