Estimating cognitive load from speech has largely been studied in controlled laboratory settings, and little is known about how reliable such estimates are in natural collaborative conversations. We investigate whether speech and interaction dynamics predict perceived cognitive load during dyadic conversations. We analyze audio from 53 dyads performing nine collaborative tasks, extract static acoustic, dynamic, and interaction features, and train a two-head Gated Recurrent Unit (GRU) encoder to predict cognitive load scores. Results show that conversational interaction provides useful signals for predicting cognitive load related to time pressure, mental work, effort, and task performance. Temporal demand is associated with turn-taking dynamics such as overlap and speaker switches, while mental demand is linked to imbalanced participation between speakers. These findings highlight the importance of task structure and conversational interaction for modeling cognitive load in natural collaborative settings.
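To make the interaction features named in the abstract concrete (overlap, speaker switches, and participation imbalance), the following is a minimal sketch of how they could be computed from diarized speech segments. It is not the authors' feature extraction pipeline: the `Segment` type, the `interaction_features` function, the dictionary keys, and the exact definitions (for example, imbalance as the difference in speaking time divided by total speaking time) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One diarized speech segment (hypothetical input format)."""
    speaker: str   # speaker label, e.g. "A" or "B"
    start: float   # start time in seconds
    end: float     # end time in seconds


def interaction_features(segments):
    """Compute simple turn-taking features for one conversation.

    Assumes segments from the same speaker do not overlap each other.
    The feature definitions are illustrative, not taken from the paper.
    """
    segs = sorted(segments, key=lambda s: (s.start, s.end))
    if not segs:
        return {"speaker_switches": 0, "overlap_seconds": 0.0,
                "participation_imbalance": 0.0}

    # Total speaking time per speaker.
    talk = {}
    for s in segs:
        talk[s.speaker] = talk.get(s.speaker, 0.0) + (s.end - s.start)

    # A switch is counted whenever consecutive segments (ordered by
    # start time) come from different speakers.
    switches = sum(1 for a, b in zip(segs, segs[1:])
                   if a.speaker != b.speaker)

    # Overlap: total time during which segments of different speakers
    # intersect. Because segs is sorted by start time, the inner loop
    # can stop at the first segment that begins after `a` has ended.
    overlap = 0.0
    for i, a in enumerate(segs):
        for b in segs[i + 1:]:
            if b.start >= a.end:
                break
            if a.speaker != b.speaker:
                overlap += min(a.end, b.end) - b.start

    # Imbalance: 0.0 means equal speaking time, 1.0 means only one
    # person spoke.
    total = sum(talk.values())
    times = sorted(talk.values())
    least = times[0] if len(times) > 1 else 0.0
    imbalance = (times[-1] - least) / total if total > 0 else 0.0

    return {"speaker_switches": switches,
            "overlap_seconds": overlap,
            "participation_imbalance": imbalance}


# Toy dyad: A speaks 0-4 s, B speaks 3-5 s (1 s of overlap), A speaks 5-6 s.
example = [Segment("A", 0.0, 4.0), Segment("B", 3.0, 5.0),
           Segment("A", 5.0, 6.0)]
features = interaction_features(example)
```

In the toy example, speaker A talks for 5 s and speaker B for 2 s, so the imbalance is 3/7, with two speaker switches and one second of overlapping speech. Features of this kind would typically be computed per task or per time window before being passed to a sequence model; the windowing choice is left open here because the abstract does not specify it.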