Artificial agents that support human group interactions hold great promise, especially in sensitive contexts such as well-being promotion and therapeutic interventions. However, current systems struggle to mediate group interactions involving people who are not neurotypical. This limitation arises because most AI detection models (e.g., for turn-taking) are trained on data from neurotypical populations. This work takes a step toward inclusive AI by addressing the challenge of eye contact detection, a core component of non-verbal communication, with and for people with Intellectual and Developmental Disabilities. First, we introduce a new dataset, Multi-party Interaction with Intellectual and Developmental Disabilities (MIDD), which captures atypical gaze and engagement patterns. Second, we compare MIDD with neurotypical datasets, highlighting differences in class imbalance, speaking activity, gaze distribution, and interaction dynamics. Third, we evaluate classifiers ranging from SVMs to FSFNet, showing that fine-tuning on MIDD improves performance, though notable limitations remain. Finally, we report insights from a focus group with six therapists, which help interpret our quantitative findings and clarify the practical implications of atypical gaze and engagement patterns. Based on these results, we discuss data-driven strategies and emphasize the importance of feature choice for building more inclusive human-centered tools.